Methods and Compositions for Sensitive and Rapid, Functional 
Identification of Genomic Polynucleotides 
and Use For Cellular Assays in Drug Discovery 

Cross Reference to Related Applications 

This application claims the benefit of an earlier filing date to a patent application 
identified as United States patent application No. 08/719,697, filed September 26, 1996, 
entitled "Methods and compositions for sensitive and rapid, functional identification of 
genomic polynucleotides and secondary screening capabilities," of which the present 
application is a Continuation-in-Part and which is incorporated herein by reference. 

Technical Field 

The present invention generally relates to methods and compositions for the 
identification of useful and functional portions of the genome and compounds for 
modulating such portions of the genome, particularly the identification of proteins that 
are directly or indirectly transcriptionally regulated and compounds for regulating such 
proteins, either directly or indirectly. 

Background 

The identification and isolation of useful portions of the genome requires 
extensive expenditure of time and financial resources. Currently, many genome projects 
use various strategies to reduce cloning and sequencing times. While genome projects 
rapidly expand the database of genetic material, such projects often lack the ability to 
integrate the information with the biology of the cell or organism from which the genes 
were isolated. In some instances, coding regions of newly isolated genes reveal sequence 
homology to other genes of known function. This type of analysis can, at best, provide 
clues as to the possible relationships between different genes and proteins. Genomic 
projects in general, however, suffer from the inability to rapidly and directly isolate and 
identify specific, yet unknown, genes associated with a particular biological process or 
processes. 

The evaluation of the function of genes identified from genomic sequencing 
projects requires cloning the discovered gene into an expression system suitable for 
functional screening. Transferring the discovered gene into a functional screening system 
requires additional expenditure of time and resources without a guarantee that the correct 
screening system was chosen. Since the function of the discovered gene is often 
unknown or only surmised by inference to structurally related genes, the chosen 
screening system may not have any relationship to the biological function of the gene. 
For example, a gene may encode a protein that is structurally homologous to the 
beta-adrenergic receptor and have a dissimilar function. Further, if negative results are 
obtained in the screen, it cannot be easily determined whether 1) the gene or gene 
product is not functioning properly in the screening assay or 2) the gene or gene product 
is not directly or indirectly involved in the biological process being assayed by the 
screening system. 

Consequently, there is a need to provide methods and compositions for rapidly 
isolating portions of genomes associated with a known biological process and to screen 
such portions of genomes for activity without the necessity of transferring the gene of 
interest into an additional screening system. 


Brief Description of the Figures 

FIG. 1 shows a comparison between an application of a prior art reporter gene 
with methods described herein, and one embodiment of the invention. The prior art uses 
the β-gal reporter and requires the establishment of clones prior to expression analysis. 
One embodiment of this invention allows for the rapid identification of living cell clones 
from large multiclonal populations of BLEC (beta-lactamase expression construct) 
integrated cells. This is a significant advancement over the prior art, which requires the 
analysis of individual clones followed by the retrieval of the selected clone from a duplicate 
clonal stock of living cells. 

FIG. 2 shows a representation of how one embodiment of the invention reports 
the expression of a pathway within a cell and can be used for screening. 

FIG. 3 shows a schematic plasmid map of the BLEC-1. 

FIG. 4 shows the FACS analysis of a population of genomically BLEC 
integrated clones. Individual cells are plotted by fluorescent emission properties at 400 
nm excitation. The x axis represents green emission (530 nm). The y axis represents blue 
emission (465 nm). Cells with a high blue/green ratio will appear blue in color and cells 
with a low blue/green ratio will appear green in color. A) Unselected multiclonal 
population of BLEC integrated RBL-1 cell clones. B) Population of clones sorted from 
3A (R1) that were cultured for an additional 7 days and resorted. C) Population from 3B 
with addition of 1 µM ionomycin for 12 hours prior to sorting. 
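
The blue/green readout described for FIG. 4 can be illustrated with a short sketch. The Python below is not part of the disclosure: the function names, the per-cell intensity values, and the ratio cutoff of 1.0 are all assumptions chosen for illustration, since the figure description gives no numeric gate.

```python
from typing import List, Tuple

# Hypothetical cutoff: the figure description gives no numeric threshold.
RATIO_THRESHOLD = 1.0


def blue_green_ratio(blue_465: float, green_530: float) -> float:
    """Return the blue (465 nm) to green (530 nm) emission ratio for one cell."""
    if green_530 <= 0:
        raise ValueError("green emission must be positive")
    return blue_465 / green_530


def classify_cell(blue_465: float, green_530: float,
                  threshold: float = RATIO_THRESHOLD) -> str:
    """Label a cell 'blue' (high ratio) or 'green' (low ratio)."""
    ratio = blue_green_ratio(blue_465, green_530)
    return "blue" if ratio > threshold else "green"


def sort_population(cells: List[Tuple[float, float]],
                    threshold: float = RATIO_THRESHOLD):
    """Split (blue, green) intensity pairs into blue- and green-appearing groups."""
    blue_cells, green_cells = [], []
    for blue, green in cells:
        if classify_cell(blue, green, threshold) == "blue":
            blue_cells.append((blue, green))
        else:
            green_cells.append((blue, green))
    return blue_cells, green_cells


if __name__ == "__main__":
    # Made-up intensities in arbitrary units, for illustration only.
    population = [(900.0, 150.0), (120.0, 800.0), (400.0, 390.0)]
    blue_cells, green_cells = sort_population(population)
    print(len(blue_cells), "blue-appearing;", len(green_cells), "green-appearing")
```

In practice the gate would be drawn on the two-dimensional plot rather than fixed in advance; a single ratio cutoff is used here only to keep the sketch short.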

Summary 

The present invention recognizes that β-lactamase polynucleotides can be 
effectively used in living eukaryotic cells to functionally identify active portions of a 
genome directly or indirectly associated with a biological process. The present invention 
also recognizes for the first time that β-lactamase activity can be measured using 
membrane permeant substrates in living cells incubated with a test chemical that directly 
or indirectly interacts with a portion of the genome having an integrated β-lactamase 
polynucleotide. The present invention, thus, permits the rapid identification and isolation 
of genomic polynucleotides indirectly or directly associated with a defined biological 
process and identification of compounds that modulate such processes and regions of the 
genome. Because the identification of active genomic polynucleotides is permitted in 
living cells, further functional characterization can be conducted using the same cells, and 
optionally, the same screening assay. The ability to functionally screen immediately after 
the rapid identification of a functionally active portion of a genome, without the necessity 
of transferring the identified portion of the genome into a secondary screening system, 
represents, among other things, a distinct advantage over an application of a prior art 
reporter gene with the methods described herein, as shown in FIG. 1. 

The invention provides for a method of identifying portions of a genome, e.g. 
genomic polynucleotides, in a living cell using a polynucleotide encoding a protein with 
β-lactamase activity that can be detected with a membrane permeant β-lactamase 
substrate. Typically, the method involves inserting a polynucleotide encoding a protein 
with β-lactamase activity into the genome of an organism using any method known in the 
art, developed in the future or described herein. Usually, a β-lactamase expression 
construct will be used to integrate a β-lactamase polynucleotide into a eukaryotic 
genome, as described herein. The cell, such as a eukaryotic cell, is usually contacted with 
a predetermined concentration of a modulator, either before or after integration of the 
β-lactamase polynucleotide. β-lactamase activity is usually then measured inside the living 
cell, preferably with fluorescent, membrane permeant β-lactamase substrates that are 
transformed by the cell into membrane impermeant β-lactamase substrates as described 
herein. 

The invention also provides for a method of identifying proteins or compounds 
that directly or indirectly modulate a genomic polynucleotide. Generally, the method 
comprises inserting a β-lactamase expression construct into a eukaryotic genome, 
usually non-yeast, contained in at least one living cell, contacting the cell with a 
predetermined concentration of a modulator, and detecting β-lactamase activity in the 
cell. 

The invention also provides for a method of screening compounds with an active 
genomic polynucleotide that comprises: 1) optionally contacting a multiclonal population 
of cells with a first test chemical prior to separating said cells by a FACS, 2) separating 
by a FACS said multiclonal population of cells into β-lactamase expressing cells and 
non-β-lactamase expressing cells, wherein said β-lactamase expressing cells have a detectable 
difference in cellular fluorescence properties compared to non-β-lactamase expressing 
cells, 3) contacting either population of cells with the same or a different test chemical, 
and 4) optionally repeating step (2), wherein said multiclonal population of cells 
comprises eukaryotic cells having a β-lactamase expression construct integrated into a 
genome of said eukaryotic cells and a membrane permeant β-lactamase substrate 
transformed inside said cells to a membrane impermeant β-lactamase substrate. The 
steps of this method can be repeated to permit additional characterization of identified 
clones. 

WO 98/13353    PCT/US97/17395 

The invention also includes powerful methods and compositions for identifying 
physiologically relevant cellular pathways and proteins of interest of known, unknown or 
partially known function. As shown in FIG. 2, a pathway may have more than one major 
intracellular signal. Two major intracellular pathways are shown ("A" and "B"). Each 
intracellular signal pathway may also have multiple branches. Each arm is shown as 
having three signaling pathways (A1, A2, and A3; and B1, B2, and B3). By generating a 
library of clones with a β-lactamase expression construct, genomic polynucleotides for 
each signal pathway can be tagged or reported by the expression of β-lactamase. 
Pathways not affected by the modulator (shown as C1, C2, and C3) are also tagged with 
the β-lactamase expression construct. Because the modulator only modulates the expression 
of pathways A1, A2, A3, B1, B2, and B3, only clones corresponding to these genomic 
integration sites are identified as being responsive to the modulator. Clones 
corresponding to sites C1, C2, and C3 remain unaltered and are not responsive to the 
modulator. Any individual, modulated clone can be immediately isolated, if not already 
isolated, and used for a drug discovery assay to screen test chemicals for activity for 
modulating the reported pathway, as described herein. 
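
The selection logic described for FIG. 2 can be sketched in a few lines of Python. This is an illustration only, not the method of the disclosure: the clone labels follow the figure, but the signal values, the function name `responsive_clones`, and the two-fold cutoff are assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical cutoff for calling a clone responsive; not taken from the disclosure.
FOLD_CHANGE_CUTOFF = 2.0


def responsive_clones(readouts: Dict[str, Tuple[float, float]],
                      cutoff: float = FOLD_CHANGE_CUTOFF) -> List[str]:
    """Return clone labels whose reporter signal changes at least `cutoff`-fold.

    `readouts` maps a clone label to (signal without modulator, signal with
    modulator). Both up-regulation and down-regulation count as a response.
    """
    hits = []
    for clone, (baseline, treated) in readouts.items():
        if baseline <= 0 or treated <= 0:
            raise ValueError(f"signals must be positive for clone {clone}")
        fold = treated / baseline
        if fold >= cutoff or fold <= 1.0 / cutoff:
            hits.append(clone)
    return sorted(hits)


# Made-up reporter signals (arbitrary units) for the nine tagged sites of FIG. 2.
EXAMPLE_READOUTS = {
    "A1": (1.0, 4.0), "A2": (1.0, 3.0), "A3": (2.0, 0.5),
    "B1": (1.0, 2.5), "B2": (0.8, 2.4), "B3": (1.5, 0.6),
    "C1": (1.0, 1.1), "C2": (1.2, 1.0), "C3": (0.9, 0.9),
}

if __name__ == "__main__":
    print(responsive_clones(EXAMPLE_READOUTS))
```

With these invented values, the six A and B clones pass the cutoff and the three C clones do not, mirroring the outcome described in the paragraph above.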

The invention also includes tools for pathway identification and drug discovery 
that can be applied to a number of targets of interest and therapeutic areas including 
proteins of interest, physiological responses even in the absence of a definitive target (e.g. 
immune response, signal transduction, neuronal function and endocrine function), viral 
targets, and orphan proteins. 


Detailed Description of the Invention 

Definitions 

Unless defined otherwise, all technical and scientific terms used herein have the 
same meaning as commonly understood by one of ordinary skill in the art to which this 
invention belongs. Generally, the nomenclature used herein and the laboratory 
procedures in cell culture, molecular genetics, and nucleic acid chemistry and 
hybridization described below are those well known and commonly employed in the art. 
Standard techniques are used for recombinant nucleic acid methods, polynucleotide 
synthesis, and microbial culture and transformation (e.g., electroporation, lipofection). 
Generally, enzymatic reactions and purification steps are performed according to the 
manufacturer's specifications. The techniques and procedures are generally performed 
according to conventional methods in the art and various general references (see 
generally, Sambrook et al. Molecular Cloning: A Laboratory Manual, 2d ed. (1989) Cold 
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., which is incorporated herein 
by reference) which are provided throughout this document. The nomenclature used 
herein and the laboratory procedures in analytical chemistry, organic synthetic chemistry, 
and pharmaceutical formulation described below are those well known and commonly 
employed in the art. Standard techniques are used for chemical syntheses, chemical 
analyses, pharmaceutical formulation and delivery, and treatment of patients. As 
employed throughout the disclosure, the following terms, unless otherwise indicated, 
shall be understood to have the following meanings: 

"Fluorescent donor moiety" refers to a fluorogenic compound or part of a 
compound (including a radical) which can absorb energy and is capable of transferring 
the energy to another fluorogenic molecule or part of a compound. Suitable donor 
fluorogenic molecules include, but are not limited to, coumarins and related dyes, 
xanthene dyes such as fluoresceins, rhodols, and rhodamines, resorufins, cyanine dyes, 
bimanes, acridines, isoindoles, dansyl dyes, aminophthalic hydrazides such as luminol 
and isoluminol derivatives, aminophthalimides, aminonaphthalimides, 
aminobenzofurans, aminoquinolines, dicyanohydroquinones, and europium and terbium 
complexes and related compounds. 

"Quencher" refers to a chromophoric molecule or part of a compound that is 
capable of reducing the emission from a fluorescent donor when attached to the donor. 
Quenching may occur by any of several mechanisms including fluorescence resonance 
energy transfer, photoinduced electron transfer, paramagnetic enhancement of 
intersystem crossing, Dexter exchange coupling, and exciton coupling such as the 
formation of dark complexes. 

"Acceptor" refers to a quencher that operates via fluorescence resonance energy 
transfer. Many acceptors can re-emit the transferred energy as fluorescence. Examples 
include coumarins and related fluorophores, xanthenes such as fluoresceins, rhodols, and 
rhodamines, resorufins, cyanines, difluoroboradiazaindacenes, and phthalocyanines. 
Other chemical classes of acceptors generally do not re-emit the transferred energy. 
Examples include indigos, benzoquinones, anthraquinones, azo compounds, nitro 
compounds, indoanilines, and di- and triphenylmethanes. 

"Dye" refers to a molecule or part of a compound that absorbs specific 
frequencies of light, including but not limited to ultraviolet light. The terms "dye" and 
"chromophore" are synonymous. 

"Fluorophore" refers to a chromophore that fluoresces. 
"Membrane-permeant derivative" refers to a chemical derivative of a compound 
that increases membrane permeability of the compound. These derivatives are made 
better able to cross cell membranes, i.e. membrane permeant, because hydrophilic groups 
are masked to provide more hydrophobic derivatives. Also, the masking groups are 
designed to be cleaved from the fluorogenic substrate within the cell to generate the 
derived substrate intracellularly. Because the substrate is more hydrophilic than the 
membrane permeant derivative, it is now trapped within the cells. 

"Isolated polynucleotide" refers to a polynucleotide of genomic, cDNA, or 
synthetic origin or some combination thereof, which by virtue of its origin the "isolated 
polynucleotide" (1) is not associated with the cell in which the "isolated polynucleotide" 
is found in nature, or (2) is operably linked to a polynucleotide which it is not linked to in 
nature. 

"Isolated protein" refers to a protein of cDNA, recombinant RNA, or synthetic 
origin or some combination thereof, which by virtue of its origin the "isolated protein" (1) 
is not associated with proteins it is normally found with in nature, or (2) is isolated 
from the cell in which it normally occurs or (3) is isolated free of other proteins from the 
same cellular source, e.g. free of human proteins, or (4) is expressed by a cell from a 
different species, or (5) does not occur in nature. 

"Polypeptide" is used herein as a generic term to refer to native protein, 
fragments, or analogs of a polypeptide sequence. Hence, native protein, fragments, and 
analogs are species of the polypeptide genus. Preferred β-lactamase polypeptides 
include those with the polypeptide sequence represented in the SEQUENCE ID. 
LISTING and any other polypeptide or protein having similar β-lactamase activity as 
measured by one or more of the assays described herein. β-lactamase polypeptides or 
proteins can include any protein having sufficient activity for detection in the assays 
described herein. 

"Naturally-occurring" as used herein, as applied to an object, refers to the fact that 
an object can be found in nature. For example, a polypeptide or polynucleotide sequence 
that is present in an organism (including viruses) that can be isolated from a source in 
nature and which has not been intentionally modified by man in the laboratory is 
naturally-occurring. 

"Operably linked" refers to a juxtaposition wherein the components so described 
are in a relationship permitting them to function in their intended manner. A control 
sequence "operably linked" to a coding sequence is ligated in such a way that expression 
of the coding sequence is achieved under conditions compatible with the control 
sequences. 

"Control sequence" refers to polynucleotide sequences which are necessary to 
effect the expression of coding and non-coding sequences to which they are ligated. The 
nature of such control sequences differs depending upon the host organism; in 
prokaryotes, such control sequences generally include promoter, ribosomal binding site, 
and transcription termination sequence; in eukaryotes, generally, such control sequences 
include promoters and transcription termination sequence. The term "control sequences" 
is intended to include, at a minimum, components whose presence can influence 
expression, and can also include additional components whose presence is advantageous, 
for example, leader sequences and fusion partner sequences. 

"Polynucleotide" refers to a polymeric form of nucleotides of at least 10 bases in 
length, either ribonucleotides or deoxynucleotides or a modified form of either type of 
nucleotide. The term includes single and double stranded forms of DNA. "Genomic 
polynucleotide" refers to a portion of a genome. "Active genomic polynucleotide" or 
"active portion of a genome" refer to regions of a genome that can be up-regulated, 
down-regulated or both, either directly or indirectly, by a biological process. "Directly," in the 
context of a biological process or processes, refers to direct causation of a process that 
does not require intermediate steps, usually caused by one molecule contacting or binding 
to another molecule (the same type or different type of molecule). For example, molecule 
A contacts molecule B, which causes molecule B to exert effect X that is part of a 
biological process. "Indirectly," in the context of a biological process or processes, refers 
to indirect causation that requires intermediate steps, usually caused by two or more 
direct steps. For example, molecule A contacts molecule B to exert effect X which in 
turn causes effect Y. 

"β-lactamase polynucleotide" refers to a polynucleotide encoding a protein with 
β-lactamase activity. Preferably, the protein with β-lactamase activity can be measured 
in a FACS at 22 degrees using a CCF2-AM β-lactamase substrate at a level of about 
1,000 such protein molecules or less per cell. More preferably, the protein with 
β-lactamase activity can be measured in a FACS at 22 degrees using a CCF2-AM 
β-lactamase substrate at a level of about 300 to 1,000 such protein molecules per cell. More 
preferably, the protein with β-lactamase activity can be measured in a FACS at 22 
degrees using a CCF2-AM β-lactamase substrate at a level of about 25 to 300 such protein 
molecules per cell. Proteins with β-lactamase activity that require more than 1,000 
molecules of such protein per cell for detection with a FACS at 22 degrees using a 
CCF2-AM β-lactamase substrate can be used and preferably have at least 5% of the 
activity of the protein with SEQ. ID. NO.: 1. 

"Sequence homology" refers to the proportion of base matches between two 
nucleic acid sequences or the proportion of amino acid matches between two amino acid 
sequences. When sequence homology is expressed as a percentage, e.g., 50%, the 
percentage denotes the proportion of matches over the length of sequence from a desired 
sequence (e.g. β-lactamase sequences, such as SEQ. ID. NO.: 1) that is compared to 
some other sequence. Gaps (in either of the two sequences) are permitted to maximize 
matching; gap lengths of 15 bases or less are usually used, 6 bases or less are preferred 
with 2 bases or less more preferred. When using oligonucleotides as probes or treatments, 
the sequence homology between the target nucleic acid and the oligonucleotide sequence 
is generally not less than 17 target base matches out of 20 possible oligonucleotide base 
pair matches (85%); preferably not less than 9 matches out of 10 possible base pair 
matches (90%), and most preferably not less than 19 matches out of 20 possible base pair 
matches (95%). 

"Selectively hybridize" refers to detectably and specifically bind. 
Polynucleotides, oligonucleotides and fragments thereof selectively hybridize to target 
nucleic acid strands, under hybridization and wash conditions that minimize appreciable 
amounts of detectable binding to nonspecific nucleic acids. High stringency conditions 
can be used to achieve selective hybridization conditions as known in the art and 
discussed herein. Generally, the nucleic acid sequence homology between the 
polynucleotides, oligonucleotides, and fragments thereof and a nucleic acid sequence of 
interest will be at least 30%, and more typically with preferably increasing homologies of 
at least about 40%, 50%, 60%, 70%, and 90%. 

Typically, hybridization and washing conditions are performed at high stringency 
according to conventional hybridization procedures. Positive clones are isolated and 
sequenced. For illustration and not for limitation, a full-length polynucleotide 
corresponding to the nucleic acid sequence of SEQ. ID. NO.: 1 may be labeled and used as 
a hybridization probe to isolate genomic clones from the appropriate target library in 
λEMBL4 or λGEM11 (Promega Corporation, Madison, Wisconsin); typical 
hybridization conditions for screening plaque lifts (Benton and Davis (1978) Science 196: 
180) can be: 50% formamide, 5 x SSC or SSPE, 1-5 x Denhardt's solution, 0.1-1% SDS, 
100-200 µg sheared heterologous DNA or tRNA, 0-10% dextran sulfate, 1 x 10^5 to 1 x 10^7 
cpm/ml of denatured probe with a specific activity of about 1 x 10^8 cpm/µg, and 
incubation at 42°C for about 6-36 hours. Prehybridization conditions are essentially 
identical except that probe is not included and incubation time is typically reduced. 
Washing conditions are typically 1-3 x SSC, 0.1-1% SDS, 50-70°C with change of wash 
solution at about 5-30 minutes. Cognate sequences, including allelic sequences, can be 
obtained in this manner. 

Two amino acid sequences are homologous if there is a partial or complete 
identity between their sequences. For example, 85% homology means that 85% of the 
amino acids are identical when the two sequences are aligned for maximum matching. 
Gaps (in either of the two sequences being matched) are allowed in maximizing 
matching; gap lengths of 5 or less are preferred with 2 or less being more preferred. 
Alternatively and preferably, two protein sequences (or polypeptide sequences derived 
from them of at least 30 amino acids in length) are homologous, as this term is used 
herein, if they have an alignment score of more than 5 (in standard deviation units) 
using the program ALIGN with the mutation data matrix and a gap penalty of 6 or 
greater. See Dayhoff, M.O., in Atlas of Protein Sequence and Structure, 1972, volume 5, 
National Biomedical Research Foundation, pp. 101-110, and Supplement 2 to this 
volume, pp. 1-10. The two sequences or parts thereof are more preferably homologous if 
their amino acids are greater than or equal to 30% identical when optimally aligned using 
the ALIGN program. 

"Corresponds to" means that a polynucleotide sequence is homologous (i.e., is 
identical, not strictly evolutionarily related) to all or a portion of a reference 
polynucleotide sequence, or that a polypeptide sequence is identical to all or a portion of 
a reference polypeptide sequence. In contradistinction, the term "complementary to" is 
used herein to mean that the complementary sequence is homologous to all or a portion of 
a reference polynucleotide sequence. For illustration, the nucleotide sequence "TATAC" 
corresponds to a reference sequence "TATAC" and is complementary to a reference 
sequence "GTATA". 
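
The "TATAC"/"GTATA" illustration can be checked with a short Python sketch. It is offered as an aid to the reader rather than as part of the definitions: the function names are invented here, "homologous" is simplified to exact identity, and "a portion of" is simplified to a contiguous substring.

```python
# Watson-Crick pairing for DNA bases.
_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(_PAIR[base] for base in reversed(seq.upper()))


def corresponds_to(seq: str, reference: str) -> bool:
    """True if `seq` is identical to all or a contiguous portion of `reference`."""
    return seq.upper() in reference.upper()


def complementary_to(seq: str, reference: str) -> bool:
    """True if the reverse complement of `seq` is identical to all or a
    contiguous portion of `reference`."""
    return reverse_complement(seq) in reference.upper()


if __name__ == "__main__":
    print(reverse_complement("TATAC"))
    print(corresponds_to("TATAC", "TATAC"))
    print(complementary_to("TATAC", "GTATA"))
```

Reading both strands 5' to 3' is what makes "TATAC" complementary to "GTATA": the complement is taken base by base and the result is reversed.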

The following terms are used to describe the sequence relationships between two 
or more polynucleotides: "reference sequence," "comparison window," "sequence 
identity," "percentage of sequence identity," and "substantial identity." A "reference 
sequence" is a defined sequence used as a basis for a sequence comparison; a reference 
sequence may be a subset of a larger sequence, for example, as a segment of a full-length 
cDNA or gene sequence given in a sequence listing such as SEQ. ID. NO.: 1, or may 
comprise a complete cDNA or gene sequence. Generally, a reference sequence is at least 
20 nucleotides in length, frequently at least 25 nucleotides in length, and often at least 50 
nucleotides in length. Since two polynucleotides may each (1) comprise a sequence (i.e., 
a portion of the complete polynucleotide sequence) that is similar between the two 
polynucleotides, and (2) may further comprise a sequence that is divergent between the 
two polynucleotides, sequence comparisons between two (or more) polynucleotides are 
typically performed by comparing sequences of the two polynucleotides over a 
"comparison window" to identify and compare local regions of sequence similarity. A 
"comparison window", as used herein, refers to a conceptual segment of at least 20 
contiguous nucleotide positions wherein a polynucleotide sequence may be compared to a 
reference sequence of at least 20 contiguous nucleotides and wherein the portion of the 
polynucleotide sequence in the comparison window may comprise additions or deletions 
(i.e., gaps) of 20 percent or less as compared to the reference sequence (which does not 
comprise additions or deletions) for optimal alignment of the two sequences. Optimal 
alignment of sequences for aligning a comparison window may be conducted by the local 
homology algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2: 482, by the 
homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48: 443, 
by the search for similarity method of Pearson and Lipman (1988) Proc. Natl. Acad. Sci. 
(U.S.A.) 85: 2444, by computerized implementations of these algorithms (GAP, 
BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package Release 
7.0, Genetics Computer Group, 575 Science Dr., Madison, WI), or by inspection, and the 
best alignment (i.e., resulting in the highest percentage of homology over the comparison 
window) generated by the various methods is selected. The term "sequence identity" 
means that two polynucleotide sequences are identical (i.e., on a nucleotide-by-nucleotide 
basis) over the window of comparison. The term "percentage of sequence identity" is 
calculated by comparing two optimally aligned sequences over the window of 
comparison, determining the number of positions at which the identical nucleic acid base 
(e.g., A, T, C, G, U, or I) occurs in both sequences to yield the number of matched 
positions, dividing the number of matched positions by the total number of positions in 
the window of comparison (i.e., the window size), and multiplying the result by 100 to 
yield the percentage of sequence identity. The term "substantial identity" as used herein 
denotes a characteristic of a polynucleotide sequence, wherein the polynucleotide 
comprises a sequence that has at least 30 percent sequence identity, preferably at least 50 
to 60 percent sequence identity, more usually at least 60 percent sequence identity as 
compared to a reference sequence over a comparison window of at least 20 nucleotide 
positions, frequently over a window of at least 25-50 nucleotides, wherein the percentage 
of sequence identity is calculated by comparing the reference sequence to the 
polynucleotide sequence which may include deletions or additions which total 20 percent 
or less of the reference sequence over the window of comparison. 
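
The "percentage of sequence identity" calculation defined above reduces to a few lines of arithmetic. The Python sketch below is illustrative only: it assumes the two sequences have already been optimally aligned by one of the cited programs and are supplied as equal-length strings with "-" marking a gap. The function names and default arguments are assumptions that simply restate the minimums given in the text (30 percent identity, 20 positions, 20 percent gaps).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of sequence identity over a comparison window.

    Matched positions are divided by the window size and multiplied by 100.
    A position where either sequence has a gap ('-') is never a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matches = sum(
        1 for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def has_substantial_identity(aligned_a: str, aligned_b: str,
                             min_percent: float = 30.0,
                             min_window: int = 20,
                             max_gap_fraction: float = 0.20) -> bool:
    """Apply the minimum thresholds stated in the text to one aligned window."""
    window = len(aligned_a)
    if window != len(aligned_b) or window < min_window:
        return False
    gaps = sum(1 for a, b in zip(aligned_a, aligned_b) if a == "-" or b == "-")
    if gaps / window > max_gap_fraction:
        return False
    return percent_identity(aligned_a, aligned_b) >= min_percent


if __name__ == "__main__":
    reference = "ACGTACGTACGTACGTACGT"   # 20 positions, made up for the example
    query = "TCGTAGGTACATACGTACGT"       # same window with 3 mismatches
    print(percent_identity(reference, query))
```

The sketch scores a single window; choosing the best alignment among several candidate alignments, as the text describes, is left to the alignment software.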

As applied to polypeptides, the term "substantial identity" means that two peptide 
sequences, when optimally aligned, such as by the programs GAP or BESTFIT using 
default gap weights, share at least 30 percent sequence identity, preferably at least 40 
percent sequence identity, more preferably at least 50 percent sequence identity, and most 
preferably at least 60 percent sequence identity. Preferably, residue positions which are 
not identical differ by conservative amino acid substitutions. Conservative amino acid 
substitutions refer to the interchangeability of residues having similar side chains. For 
example, a group of amino acids having aliphatic side chains is glycine, alanine, valine, 
leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains is 
serine and threonine; a group of amino acids having amide-containing side chains is 
asparagine and glutamine; a group of amino acids having aromatic side chains is 
phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains 
is lysine, arginine, and histidine; and a group of amino acids having sulfur-containing 
side chains is cysteine and methionine. Preferred conservative amino acid substitution 
groups are: valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine, 
alanine-valine, glutamic-aspartic, and asparagine-glutamine. 
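
The preferred substitution groups listed above lend themselves to a simple lookup. The following Python sketch is an illustration under stated assumptions: the function name is invented, residues are spelled out in full, and "glutamic-aspartic" is written as "glutamic acid" and "aspartic acid".

```python
# Preferred conservative substitution groups as listed in the text above.
PREFERRED_GROUPS = [
    {"valine", "leucine", "isoleucine"},
    {"phenylalanine", "tyrosine"},
    {"lysine", "arginine"},
    {"alanine", "valine"},
    {"glutamic acid", "aspartic acid"},
    {"asparagine", "glutamine"},
]


def is_preferred_conservative(residue_a: str, residue_b: str) -> bool:
    """True if swapping one residue for the other stays within a preferred group."""
    a, b = residue_a.strip().lower(), residue_b.strip().lower()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in PREFERRED_GROUPS)


if __name__ == "__main__":
    print(is_preferred_conservative("Valine", "Leucine"))
    print(is_preferred_conservative("Alanine", "Leucine"))
```

Note that valine appears in two groups, so alanine-valine and valine-leucine both qualify while alanine-leucine does not; the groups are checked one at a time rather than merged.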

"Polypeptide fragment" refers to a polypeptide that has an amino-terminal and/or 
carboxy-terminal deletion, but where the remaining amino acid sequence is usually 
identical to the corresponding positions in the naturally-occurring sequence deduced, for 
example, from a full-length cDNA sequence (e.g., the sequence shown in SEQ. ID. NO.: 
1). "β-lactamase polypeptide fragment" refers to a polypeptide that is comprised of a 
segment of at least 25 amino acids that has substantial identity to a portion of the deduced 
amino acid sequence shown in SEQ. ID. NO.: 1 and which has at least one of the 
following properties: (1) specific binding to a β-lactamase substrate, preferably 
cephalosporin, under suitable binding conditions, or (2) the ability to effectuate 
enzymatic activity, preferably cephalosporin backbone cleavage activity, when expressed 
in a mammalian cell. Typically, analog polypeptides comprise a conservative amino acid 
substitution (or addition or deletion) with respect to the naturally occurring sequence. 
Analogs typically are at least 300 amino acids long, preferably at least 500 amino acids 
long or longer, most usually being as long as full-length naturally-occurring polypeptide. 

"Modulation " refers to the capacity to either enhance or inhibit a functional 
property of a biological activity or process (e.g., enzyme activity or receptor binding). 
Such enhancement or inhibition may be contingent on the occurrence of a specific event, 

25 such as activation of a signal transduction pathway, and/or may be manifest only in 
particular cell types. 

The term "modulator" refers to a chemical (naturally occurring or non-naturally
occurring), such as a biological macromolecule (e.g. nucleic acid, protein, non-peptide, or
organic molecule), or an extract made from biological materials such as bacteria, plants,
fungi, or animal (particularly mammalian) cells or tissues. Modulators are typically
evaluated for potential activity as inhibitors or activators (directly or indirectly) of a
biological process or processes (e.g., agonist, partial antagonist, partial agonist,
antagonist, antineoplastic agents, cytotoxic agents, inhibitors of neoplastic transformation
or cell proliferation, cell proliferation-promoting agents, and the like) by inclusion in
assays described herein. The activity of a modulator may be known, unknown or partially
known.

The term "test chemical" refers to a chemical to be tested by one or more
method(s) of the invention as a putative modulator. A test chemical is usually not known
to bind to the target of interest. The term "control test chemical" refers to a chemical
known to bind to the target (e.g., a known agonist, antagonist, partial agonist or inverse
agonist). The term "test chemical" does not typically include a chemical added as a
control condition that alters the function of the target to determine signal specificity in an
assay. Such control chemicals or conditions include chemicals that 1) non-specifically or
substantially disrupt protein structure (e.g., denaturing agents (e.g., urea or guanidinium),
chaotropic agents, sulfhydryl reagents (e.g., dithiothreitol and β-mercaptoethanol), and
proteases), 2) generally inhibit cell metabolism (e.g., mitochondrial uncouplers) and 3)
non-specifically disrupt electrostatic or hydrophobic interactions of a protein (e.g., high
salt concentrations, or detergents at concentrations sufficient to non-specifically disrupt
hydrophobic interactions). The term "test chemical" also does not typically include
chemicals known to be unsuitable for a therapeutic use for a particular indication due to
toxicity to the subject. Usually, various predetermined concentrations of test chemicals are
used for screening, such as 0.01 μM, 0.1 μM, 1.0 μM, and 10.0 μM.
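
The example screening concentrations given above step by a factor of ten. Purely as an illustration, such a series can be generated programmatically. The sketch below assumes that all concentrations are expressed in micromolar; the helper name `tenfold_series` is hypothetical and is not part of any method described herein.

```python
from decimal import Decimal


def tenfold_series(lowest_um: str, steps: int) -> list[Decimal]:
    """Return `steps` concentrations in micromolar, each ten times the previous one.

    Decimal is used so that values such as 0.01 are represented exactly.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    start = Decimal(lowest_um)
    return [start * Decimal(10) ** i for i in range(steps)]


# Assumed screening series: four concentrations starting at 0.01 micromolar.
screening_series = tenfold_series("0.01", 4)
```

A fixed, predetermined series of this kind allows every test chemical to be compared at the same set of concentrations.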

The term "target" refers to a biochemical entity involved in a biological process.
Targets are typically proteins that play a useful role in the physiology or biology of an
organism. A therapeutic chemical binds to a target to alter or modulate its function. As
used herein, targets can include cell surface receptors, G-proteins, kinases, ion channels,
phospholipases and other proteins mentioned herein.

The terms "label" or "labeled" refer to incorporation of a detectable marker, e.g.,
by incorporation of a radiolabeled amino acid or attachment to a polypeptide of biotinyl
moieties that can be detected by marked avidin (e.g., streptavidin containing a fluorescent
marker or enzymatic activity that can be detected by optical or colorimetric methods).
Various methods of labeling polypeptides and glycoproteins are known in the art and may
be used. Examples of labels for polypeptides include, but are not limited to, the
following: radioisotopes (e.g., ³H, ¹⁴C, ³⁵S, ¹²⁵I), fluorescent labels (e.g., FITC,
rhodamine, and lanthanide phosphors), enzymatic labels (or reporter genes) (e.g.,
enzymatic reporter genes such as horseradish peroxidase, β-galactosidase, luciferase and
alkaline phosphatase; and non-enzymatic reporter genes (e.g., fluorescent proteins)),
chemiluminescent labels, biotinyl groups, and predetermined polypeptide epitopes recognized
by a secondary reporter (e.g., leucine zipper pair sequences, binding sites for secondary
antibodies, metal binding domains, epitope tags).

"Substantially pure" means that an object
species is the predominant species present (i.e., on a molar basis it is more abundant than
any other individual species in the composition), and preferably a substantially purified
fraction is a composition wherein the object species comprises at least about 50 percent
(on a molar basis) of all macromolecular species present. Generally, a substantially pure
composition will comprise more than about 80 percent of all macromolecular species
present in the composition, more preferably more than about 85%, 90%, 95%, and 99%.
Most preferably, the object species is purified to essential homogeneity (contaminant
species cannot be detected in the composition by conventional detection methods)
wherein the composition consists essentially of a single macromolecular species.
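
The molar-basis criteria in the definition of "substantially pure" can be illustrated with a short calculation. The following is a minimal sketch only: the function names and the example species are hypothetical, and the thresholds are simply the approximate 50 percent and 80 percent figures quoted in the definition.

```python
def molar_fractions(moles: dict[str, float]) -> dict[str, float]:
    """Convert molar amounts of each macromolecular species into molar fractions."""
    total = sum(moles.values())
    if total <= 0:
        raise ValueError("total molar amount must be positive")
    return {name: amount / total for name, amount in moles.items()}


def purity_summary(moles: dict[str, float], object_species: str) -> dict[str, object]:
    """Check the object species against the molar-basis criteria in the definition."""
    fractions = molar_fractions(moles)
    target = fractions[object_species]
    others = [f for name, f in fractions.items() if name != object_species]
    return {
        "fraction": target,
        # More abundant than any other individual species.
        "predominant": all(target > f for f in others),
        # "at least about 50 percent" of all macromolecular species.
        "at_least_50_percent": target >= 0.50,
        # "more than about 80 percent" of all macromolecular species.
        "more_than_80_percent": target > 0.80,
    }


# Hypothetical composition, in arbitrary molar units.
summary = purity_summary(
    {"object_species": 9.0, "contaminant_a": 0.5, "contaminant_b": 0.5},
    "object_species",
)
```

A species can be predominant without reaching 50 percent, for example when three species are present at molar ratios of 4:3:3, which is why the definition states the two criteria separately.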

"Pharmaceutical agent or drug" refers to a chemical or composition capable of 
inducing a desired therapeutic effect when properly administered (e.g. using the proper 
amount and delivery modality) to a patient.

Other chemistry terms herein are used according to conventional usage in the art,
as exemplified by The McGraw-Hill Dictionary of Chemical Terms (ed. Parker, S.,
1985), McGraw-Hill, San Francisco, incorporated herein by reference.

Introduction

The present invention recognizes that β-lactamase polynucleotides can be
effectively used in living eukaryotic cells to functionally identify active portions of a
genome directly or indirectly associated with a biological process. The present invention
also recognizes for the first time that β-lactamase activity can be measured using
membrane permeant substrates in living cells incubated with a test chemical that directly
or indirectly interacts with a portion of the genome having an integrated
β-lactamase polynucleotide. The present invention, thus, permits the rapid identification
and isolation of genomic polynucleotides indirectly or directly associated with a defined
biological process and identification of compounds that modulate such processes and
regions of the genome. Because the identification of active genomic polynucleotides is
permitted in living cells, further functional characterization can be conducted using the
same cells, and optionally, the same screening assay. The ability to functionally screen 
immediately after the rapid identification of a functionally active portion of a genome, 
without the necessity of transferring the identified portion of the genome into a secondary 
screening system, represents, among other things, a distinct advantage over an application 
of a prior art reporter gene and methods described herein, as shown in FIG. 1. 

As a non-limiting introduction to the breadth of the invention, the invention
includes several general and useful aspects, including:

1) a method for identifying genes or gene products directly or indirectly
associated (e.g. regulated) with a biological process of interest (that can be
modulated by a compound) using a genomic polynucleotide operably linked
to a polynucleotide encoding a protein with β-lactamase activity,

2) a method for identifying proteins (e.g. orphan proteins or known proteins)
or compounds that directly or indirectly modulate (e.g. activate or inhibit
transcription) a genomic polynucleotide operably linked to a polynucleotide
encoding a protein with β-lactamase activity,

3) a method of screening for an active genomic polynucleotide (e.g., enhancer,
promoter or coding region in the genome) that can be directly or indirectly
associated (e.g. regulated) with a biological process of interest (that can be
modulated by a compound) using a genomic polynucleotide operably linked
to a polynucleotide encoding a protein with β-lactamase activity that can be
detected by FACS using a fluorescent, membrane permeant β-lactamase
substrate,

4) eukaryotic cells with a genomic polynucleotide operably linked to a
polynucleotide encoding a protein with β-lactamase activity, and

5) polynucleotides related to the above methods and cells.

These aspects of the invention, as well as others described herein, can be achieved by using the methods
and compositions of matter described herein. To gain a full appreciation of the scope of the invention, it
will be further recognized that various aspects of the invention can be combined to make desirable
embodiments of the invention. For example, the invention includes a method of identifying compounds
that modulate active genomic polynucleotides operably linked to a protein with β-lactamase activity that
can be detected by FACS using a fluorescent, membrane permeant β-lactamase substrate. Such
combinations result in particularly useful and robust embodiments of the invention.

Methods for Rapidly Identifying Functional Portions of a Genome

The invention provides for a method of identifying portions of a genome, e.g.
genomic polynucleotides, in a living cell using a polynucleotide encoding a protein with
β-lactamase activity that can be detected with a membrane permeant β-lactamase
substrate. Typically, the method involves inserting a polynucleotide encoding a protein
with β-lactamase activity into the genome of an organism using any method known in the
art, developed in the future or described herein. Usually, a β-lactamase expression
construct will be used to integrate a β-lactamase polynucleotide into a eukaryotic
genome, as described herein. The cell, such as a eukaryotic cell, is usually contacted with
a predetermined concentration of a modulator, either before or after integration of the
β-lactamase polynucleotide. β-lactamase activity is usually then measured inside the living
cell, preferably with fluorescent, membrane permeant β-lactamase substrates that are
transformed by the cell into membrane impermeant β-lactamase substrates as described
herein and in PCT Publication No. WO96/30540, published October 3, 1996, by Tsien et al.

Once β-lactamase polynucleotides are integrated into the genome of interest, they
come under the transcriptional control of the genome of the host cell. Integration into
the genome is usually stable, as described herein and known in the art. Transcriptional
control of the genome often results from receptor (e.g. intracellular or cell surface
receptor) activation, which can regulate transcriptional and translational events to change
the amount of protein present in the cell. The amount of protein present with β-lactamase
activity can be measured via its enzymatic action on a substrate. Normally, the substrate
is a small uncharged molecule that, when added to the extracellular solution, can
penetrate the plasma membrane to encounter the enzyme. A charged molecule can also
be employed, but the charges are generally masked by groups that will be cleaved by
endogenous or heterologous cellular enzymes or processes (e.g., esters cleaved by
cytoplasmic esterases). As described more fully herein and in PCT Publication No.
WO96/30540 published October 3, 1996, by Tsien et al., which is herein incorporated by
reference, the use of substrates that exhibit changes in their fluorescence spectra upon
interaction with an enzyme is particularly desirable. In some assays, the fluorogenic
substrate is converted to a fluorescent product by β-lactamase activity. Alternatively, the
fluorescent substrate changes fluorescence properties upon conversion by β-lactamase
activity. Preferably, the product should be very fluorescent to obtain a maximal signal,
and very polar, to stay trapped inside the cell.

Vectors and Integration

Vectors, such as viral and plasmid vectors, can be used to introduce genes or
genetic material of the invention into cells, preferably by integration into the host cell
genome. Such viral vectors can be any appropriate viruses, such as retroviruses,
adenoviruses, adeno-associated viruses, papillomaviruses, herpes viruses, or any
ecotropic or amphotropic virus, preferably a retrovirus. The viruses can be, for example,
retroviruses or any other virus modified to be replicatively deficient, cytomegalovirus,
Friend leukemia virus, SIV, HIV, Rous sarcoma virus, or Moloney virus such as
Moloney murine leukemia virus.

Vectors, such as retrovirus vectors, can encode an operable selective protein so
that cells that have been transformed can be positively selected for. Such selective
proteins can be antibiotic resistance factors, such as neomycin resistance (e.g., NEO).
Alternatively, cells can be negatively selected for using an enzyme, such as herpes
simplex virus thymidine kinase (HSVTK), that transforms a pro-toxin into a toxin. Viral
vectors, such as retroviral vectors, are available that are suitable for these purposes, such
as the PSIR vector (available from ClonTech of California with PT67 packaging cells)
and the GgU3Hisen, GgTNKneoU3 and GgTKNeoen variants of Moloney murine leukemia
virus. Vector modifications can be made that allow more efficient
integration into the host cell genome. Such modifications include sequences that enhance
integration or known methods to promote nucleic acid transportation into the nucleus of
the host cell. Retroviral vectors such as those described in U.S. Patent Number 5,364,783 by
Ruley and von Melchner can also be used to increase transfection efficiency.

Vectors can also be used with liposomes or other vesicles that can transport
genetic material into a cell. Appropriate structures are known in the art. The liposomes
can include vectors such as plasmids or yeast artificial chromosomes (YACs), which can
include genetic material to be introduced into the cell. Plasmids can also be introduced
into cells by any known methods, such as electroporation, calcium phosphate, or
lipofection. DNA fragments, without a plasmid or viral vector, can also be used.

In one aspect of the present invention, vectors are used to introduce reporter genes
into cells. When the reporter gene integrates into the genome of a target cell so that the
reporter gene is expressed, that event can be detected by detecting the reporter gene.
Clones that express the reporter gene under a wide variety of conditions can be used for a
variety of purposes, including gene and drug discovery. Chromosomes tagged with
β-lactamase expression constructs can be transferred to desired recipient cells using methods
established in the art.

β-lactamase polynucleotides can be placed on a variety of plasmids for integration
into a genome and to identify genes from a large variety of organisms (Gorman, C.M. et
al., Mol. Cell Biol. 2: 1044-1051 (1982); Alam, J. and Cook, J.L., Anal. Biochem. 188:
245-254 (1990)). Standard techniques are used to introduce these polynucleotides into a
cell or whole organism (e.g., as described in Sambrook, J., Fritsch, E.F. and Maniatis, T.
Expression of cloned genes in cultured mammalian cells. In: Molecular Cloning, edited
by Nolan, C. New York: Cold Spring Harbor Laboratory Press, 1989). Resistance
markers can be used to select for successfully transfected cells.

If a β-lactamase expression construct is selected for integrating a β-lactamase
polynucleotide into a eukaryotic genome, it will usually contain at least a β-lactamase
polynucleotide operably linked to a splice acceptor and optionally a splice donor.
Alternatively, the β-lactamase polynucleotide may be operably linked to any means for
integrating a polynucleotide into a genome, preferably for integration into an intron of a
gene to produce an in frame translation product. The β-lactamase expression construct
can optionally comprise, depending on the application, an IRES element, a splice donor, a
poly A site, a translational start site (e.g. a Kozak sequence), an LTR (long terminal repeat)
and a selectable marker.

β-Lactamase Reporter Genes

Preferably, β-lactamase polynucleotides encode a cytosolic form of a protein with
β-lactamase activity. This provides the advantage of trapping the normally secreted
β-lactamase protein within the cell, which enhances the signal to noise ratio of the signal
associated with β-lactamase activity. Usually, this is accomplished by removing or
disabling the signal sequence normally present for secretion. As used herein, "cytosolic
protein with β-lactamase activity" refers to a protein with β-lactamase activity that lacks
the proper amino acid sequences for secretion from the cell, e.g., the signal sequence. For
example, in the polypeptide of SEQ. ID. NO.: 1, the signal sequence has been replaced
with the amino acids Met-Ser. Accordingly, upon expression, β-lactamase activity
remains within the cell. For expression in mammalian cells it is preferable to use
β-lactamase polynucleotides with nucleotide sequences preferred by mammalian cells. In
some instances, a secreted form of β-lactamase can be used with the methods and
compositions of the invention. In particular, genes having sequences that direct selection
can be identified with a β-lactamase assay. This also permits multiplying based on
directed localization of β-lactamase.

Proteins with β-lactamase activity can be any known to the art, developed in the
future or described herein. This includes, for example, the enzymes represented by SEQ.
ID. NOs. described herein. Nucleic acids encoding proteins with β-lactamase activity
can be obtained by methods known in the art, for example, by polymerase chain reaction
of cDNA using primers based on the DNA sequence in SEQ. ID. NO.: 1. PCR methods
are described in, for example, U.S. Pat. No. 4,683,195; Mullis et al. (1987) Cold Spring
Harbor Symp. Quant. Biol. 51:263; and Erlich, ed., PCR Technology, (Stockton Press,
NY, 1989).

Sequences for Assisting Integration 

The β-lactamase expression construct typically includes sequences for integration,
especially sequences designed to target or enhance integration into the genome.
The splice site acceptor can be operably linked to the β-lactamase polynucleotide to
facilitate expression upon integration into an intron. Usually, a fusion RNA will be
created with the coding region of an adjacent, operably linked portion of the exon. A splice
acceptor sequence is a sequence at the 3' end of an intron where it joins an exon.
The consensus sequence for a splice acceptor is
NTN(TC)(TC)(TC)TTT(TC)(TC)(TC)(TC)(TC)(TC)NCAGgt. The intronic sequences are represented by upper
case and the exonic sequence by lower case font. These sequences represent those that
are conserved from viral to primate genomes.

The splice donor site can be operably linked to the β-lactamase polynucleotide to
facilitate integration in an intron to promote expression by requiring a poly-adenylation
sequence. Usually, a fusion RNA is created with the coding region or untranslated region on the
3' end of the β-lactamase polynucleotide. This is preferred when it is desired to sequence
the coding region of the identified gene. A splice donor is a sequence at the 5' end of an
intron where it joins an exon. The consensus sequence for a splice donor
sequence is naggt(ag)aGT. The intronic sequences are represented by upper case and the
exonic sequence by lower case font. These sequences represent those that are conserved
from viral to primate genomes. This splice donor allows identification of the target gene
using 3' RACE.
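
The splice acceptor and splice donor consensus sequences quoted in this section can be expressed as simple search patterns. The following is a minimal sketch, not a splice-site predictor: it assumes the consensus strings exactly as printed, treats N/n as any base and a parenthesized pair as either base, ignores letter case, and uses hypothetical names (`ACCEPTOR`, `DONOR`, `find_sites`). Exact matching to a printed consensus is considerably stricter than splice-site recognition in a cell.

```python
import re

# Acceptor as printed: N T N (TC)x3 T T T (TC)x6 N C A G g t
ACCEPTOR = re.compile(r"[ACGT]T[ACGT][CT]{3}TTT[CT]{6}[ACGT]CAGGT")
# Donor as printed: n a g g t (ag) a G T
DONOR = re.compile(r"[ACGT]AGGT[AG]AGT")


def find_sites(pattern: re.Pattern, sequence: str) -> list[int]:
    """Return 0-based start positions of non-overlapping matches.

    The sequence is upper-cased first, so intron/exon letter case is ignored.
    """
    return [match.start() for match in pattern.finditer(sequence.upper())]


# Hypothetical test sequences constructed to contain one match each.
acceptor_hits = find_sites(ACCEPTOR, "cccatgctctttctctctacaggtaaa")
donor_hits = find_sites(DONOR, "ttcaggtaagtcc")
```

In practice a position weight matrix or similar scoring scheme would tolerate the mismatches that real splice sites carry; the strict patterns here only make the printed notation concrete.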

As an alternative to a splice donor site, a poly A site may be operably linked to
the β-lactamase polynucleotide. Poly-adenylation signals, i.e. poly A sites, include SV40
poly A sites, such as those described in the Invitrogen Catalog 1996 (California).

In some instances, it may be desirable to include in the β-lactamase expression construct
a translational start site. For instance, a translational start site allows for β-lactamase
expression even if the integration occurs in non-coding regions. Usually, such sequences
will not reduce the expression of a highly expressed gene. Translational start sites
include a "Kozak sequence" and are the preferred sequences for expression in mammalian
cells described in Kozak, M., J. Cell Biol. 108: 229-241 (1989). The nucleotide sequence
for a cytosolic protein with β-lactamase activity in SEQ. ID. NO.: 3 contains a Kozak
sequence at nucleotides -9 to 4 (GGTACCACCATGA).
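
The numbering "-9 to 4" for the quoted Kozak region can be checked by numbering each base relative to the start codon. The sketch below assumes the common convention in which the A of ATG is position +1 and there is no position 0; the function name `number_relative_to_start` is hypothetical.

```python
KOZAK_REGION = "GGTACCACCATGA"  # sequence quoted in the text


def number_relative_to_start(region: str, start_codon: str = "ATG") -> dict[int, str]:
    """Map each base to its position relative to the first base of the start codon.

    Assumed convention: the first base of the start codon is +1 and the base
    immediately upstream of it is -1 (no position 0).
    """
    a_index = region.index(start_codon)
    numbered = {}
    for i, base in enumerate(region):
        offset = i - a_index
        position = offset + 1 if offset >= 0 else offset
        numbered[position] = base
    return numbered


positions = number_relative_to_start(KOZAK_REGION)
```

Under this convention the thirteen quoted nucleotides span positions -9 through +4, with the start codon at +1 to +3, which agrees with the range stated above.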

It is also preferable, when using mammalian cells, to include an IRES ("internal
ribosome entry binding site") element in the β-lactamase expression construct. Typically,
an IRES element will improve the yield of expressing clones. One caveat of integration
vectors is that only one in three insertions into an intron will be in frame and produce a
functional reporter protein. This limitation can be reduced by cloning an IRES sequence
between the splice acceptor site and the reporter gene (e.g., a β-lactamase
polynucleotide). This eliminates reading frame restrictions and possible functional
inactivation of the reporter protein by fusion to an endogenous protein. IRES elements
include those from picornaviruses, picorna-related viruses, and hepatitis A and C.
Preferably, the IRES element is from a poliovirus. Specific IRES elements can be found,
for instance, in WO9611211 by Das and Coward published 4/16/96, EP 585983 by Zurr
published 3/7/96, WO9601324 by Berlioz published 1/18/96 and WO9424301 by Smith
published October 27, 1994, all of which are herein incorporated by reference.
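
The "one in three" caveat follows from reading-frame arithmetic. The following is a simplified sketch under stated assumptions: the reporter's own frame begins at its first nucleotide, the only variable is the number of endogenous coding nucleotides spliced upstream of it, and an IRES makes translation of the reporter independent of that fusion. The function names and example lengths are hypothetical.

```python
from fractions import Fraction


def reporter_in_frame(upstream_coding_nt: int, has_ires: bool = False) -> bool:
    """Return True if the reporter would be read in its own frame.

    upstream_coding_nt: coding nucleotides of the endogenous gene spliced in
    front of the reporter (hypothetical input for illustration).
    has_ires: True if an IRES lies between the splice acceptor and the reporter.
    """
    if has_ires:
        return True
    return upstream_coding_nt % 3 == 0


def in_frame_fraction(lengths, has_ires: bool = False) -> Fraction:
    """Fraction of the given upstream lengths that leave the reporter in frame."""
    lengths = list(lengths)
    hits = sum(reporter_in_frame(n, has_ires) for n in lengths)
    return Fraction(hits, len(lengths))


# Upstream lengths covering each of the three frame phases equally often.
example_lengths = range(300, 306)
fusion_only = in_frame_fraction(example_lengths)
with_ires = in_frame_fraction(example_lengths, has_ires=True)
```

The sketch ignores other causes of inactivity, such as loss of function of the reporter when fused to an endogenous protein, which the IRES arrangement is also described as avoiding.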
To improve selection for integration of a β-lactamase polynucleotide into a genome, a selectable marker
can be used in the β-lactamase expression construct. Selectable markers for mammalian
cells are known in the art, and include for example, thymidine kinase, dihydrofolate
reductase (together with methotrexate as a DHFR amplifier), aminoglycoside
phosphotransferase, hygromycin B phosphotransferase, asparagine synthetase, adenosine
deaminase, metallothionein, and antibiotic resistance genes such as genes for neomycin
resistance. Selectable markers for non-mammalian cells are known in the art and include
genes providing resistance to antibiotics, such as kanamycin, tetracycline, and ampicillin.

The invention can be readily practiced with genomes having intron/exon
structures. Such genomes include those of mammals (e.g., human, rabbit, mouse, rat,
monkey, pig and cow), vertebrates, insects and yeast. Intron-targeted vectors are more
commonly used in mammalian cells as introns, or intervening sequences, are
considerably larger than exons, or mRNA coding regions, in mammals. Intron targeting
can be achieved by cloning a splice acceptor or 3' intronic sequences upstream of a
β-lactamase polynucleotide gene followed by a polyadenylation signal or 5' intronic splice
donor site. When the vector inserts into an intron, the reporter gene (e.g., β-lactamase) is
expressed under the same control as the gene into which it has inserted.

The invention can also be practiced with genomes having reduced numbers of, or
lacking, intron/exon structures. For lower eukaryotes, which have simple genomic
organization, i.e. containing few and small introns, exon-targeted vectors can be used.
Such vectors include β-lactamase polynucleotides operably linked to a poly-adenylation
sequence and optionally to an IRES element. Lower eukaryotes include yeast, fungi
and pathogenic eukaryotes (e.g. parasites and microorganisms). For genomes lacking
intron/exon structures, restriction enzyme integration, transposon induced integration or
selection integration can be used for genomic integration. Such methods include those
described by Kuspa and Loomis, PNAS 89: 8803-8807 (1992) and Derbyshire, K.M.,
Gene Nov. 7: 143-144 (1995). Prokaryotes can be used with the invention if integration
can occur in such genomes. Retroviral vectors can also be used to integrate β-lactamase
polynucleotides into a genome (e.g., eukaryotic), such as those methods and compositions
described in U.S. Patent Number 5,364,783.

Typically, integration will occur in the regions of the genome that are accessible
to the integration vector. Such regions are usually active portions of the genome where
there is increased genome regulatory activity, e.g. increased polymerase activity or a
change in DNA binding by proteins that regulate transcription of the genome. Many
embodiments of the invention described herein can result in random integration,
especially in actively transcribed regions.

Integration into Active Portions of the Genome

Integration, however, can be directed to regions of the genome active during
specific types of genome activity. For instance, integration at sites in the genome that are
active during specific phases of the cell cycle can be promoted by synchronizing the cells
in a desired phase of the cell cycle. Such cell cycle methods include those known in the
art, such as serum deprivation or alpha factors (for yeast). Integration may also be
directed to regions of the genome active during cell regulation by a chemical, such as an
antagonist or agonist for a receptor or some other chemical that increases or decreases or
otherwise modulates genome activity. By adding the chemical of interest, genome
activity can be increased, often in specific regions, to promote integration of an integration
vector (e.g. a reporter gene construct), including those of the invention, into such
regions of the genome.

For instance, a nuclear receptor activator (general or specific) could be applied to
activate the cells prior to or during integration in order to promote integration of reporter
genes at sites in the genome that become more active during nuclear receptor activation.
Such cells could then be screened with the same or a different nuclear receptor activator to
identify which clones, and which portions of the genome, are active during nuclear
receptor activation. Any agonists, antagonists and modulators of the receptors described
herein can be used in such a manner, as well as any other chemicals that increase or
decrease genome activity.

Cells for Integration into the Genome

The cells used in the invention will typically correspond to the genome of interest.
For example, if regions of the human genome are desired to be identified, then human
cells containing a proper genetic complement will generally be used. Libraries, however,
could be biased by using cells that contain extra copies of certain chromosomes or other
portions of the genome. Cells that do not correspond to the genome of interest can also
be used if the genome of interest or significant portions of the genome of interest can be
replicated in the cells, such as by making a human-mouse hybrid.

WO 98/13353 PCT/US97/17395

Additionally, by the appropriate choice of cells and expressed proteins,
identification and screening assays can be constructed that detect active portions of the
genome associated with a biological process that requires, in whole or part, the presence
of a particular protein (protein of interest). Cells can be selected depending on the type of
proteins that are expressed (homologously or heterologously) or from the type of tissue
from which the cell line or explant was originally generated. If the identification of
portions of the genome activated by a particular type of protein is desired, then the cell
used should express that protein.

The cells can express the protein homologously, i.e. expression of the desired
protein normally or naturally occurs in the cells. Alternatively, the cells can be directed
to express a protein heterologously, i.e. expression of the desired protein which does not
normally or naturally occur in the cells. Such heterologous expression can be directed by
"turning on" the gene in the cell encoding the desired protein or by transfecting the cell
with a polynucleotide encoding the desired protein (either by constitutive expression or
inducible expression). Inducible expression is preferred if it is thought that the expressed
protein of interest may be toxic to the cells.

Many cells can be used with the invention. Such cells include, but are not
limited to, adult, fetal, or embryonic cells. These cells can be derived from the
mesoderm, ectoderm, or endoderm and can be stem cells, such as embryonic or adult
stem cells, or adult precursor cells. The cells can be of any lineage, such as vascular,
neural, cardiac, fibroblasts, lymphocytes, hepatocytes, hematopoietic,
pancreatic, epidermal, myoblasts, or myocytes. Other cells include baby hamster kidney
(BHK) cells (ATCC No. CCL10), mouse L cells (ATCC No. CCL1.3), Jurkats (ATCC
No. TIB 152) and 153 DG44 cells (see, Chasin (1986) Cell. Molec. Genet. 12: 555),
human embryonic kidney (HEK) cells (ATCC No. CRL1573), Chinese hamster ovary
(CHO) cells (ATCC Nos. CRL9618, CCL61, CRL9096), PC12 cells (ATCC No.
CRL1721) and COS-7 cells (ATCC No. CRL1651). Preferred cells include Jurkat cells,
CHO cells, neuroblastoma cells, P19 cells, F11 cells, NT-2 cells, and HEK 293 cells,
such as those described in U.S. Patent No. 5,024,939 and by Stillman et al. Mol. Cell.
Biol. 5: 2051-2060 (1985). Preferred cells for heterologous protein expression are those
that can be readily and efficiently transfected.

Cells used in the present invention can be from continuous cell lines or primary
cell lines obtained from, for example, mammalian tissues, organs, or fluids. Tissue
sections as well as dispersed cells can be used in the present invention. Cells can also be
obtained from transgenic animals that have been engineered to express a reporter gene.
Cells obtained from transgenic or non-transgenic animals are preferred for cells that are
difficult to culture in vitro, such as neural and hepatic cells. Primary cell lines can be
made continuous using known methods, such as fusing primary cells with a continuous
cell line or expressing transforming proteins. Cells of the invention can be stored or used
with methods of the invention as isolated, clonal populations in plates such as those
described in commonly owned United States Patent Applications having Attorney Docket
Nos: 08366/010001, entitled "Low background multi-well plates and platforms for
spectroscopic measurements" (Coassin et al., filed June 2, 1997); and 08366/009001,
entitled "Low background multi-well plates with greater than 864 wells for spectroscopic
measurements" (Coassin et al., filed June 2, 1997); each of which is incorporated herein
by reference. Preferably, cells are stored or used in plates with 96, 384, 1536 or 3456
wells per plate. A single cell or a plurality of cells can be placed in such wells. Panels of
such isolated clonal populations will typically have 1,000, 10,000, or 100,000 or more such
populations, representative of substantially equivalent numbers of independent
integration sites. Such panels can be used in profiling, pathway identification,
modulator identification, modulator characterization, and other methods of the invention.
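
As a rough illustration of scale, the number of plates needed to array a panel of clonal populations can be estimated for each of the plate formats mentioned above. The sketch assumes one clonal population per well, which is only one of the arrangements contemplated; the names `PLATE_FORMATS`, `plates_needed` and `plate_plan` are hypothetical.

```python
PLATE_FORMATS = (96, 384, 1536, 3456)  # wells per plate, as listed in the text


def plates_needed(n_populations: int, wells_per_plate: int) -> int:
    """Plates required when each well holds exactly one clonal population."""
    if n_populations < 0 or wells_per_plate <= 0:
        raise ValueError("counts must be non-negative and wells must be positive")
    return -(-n_populations // wells_per_plate)  # ceiling division on integers


def plate_plan(n_populations: int) -> dict[int, int]:
    """Map each plate format to the number of plates needed for the panel."""
    return {wells: plates_needed(n_populations, wells) for wells in PLATE_FORMATS}


# A panel of 10,000 independent clonal populations.
plan_10k = plate_plan(10_000)
```

Under this assumption a panel of 10,000 populations occupies 105 plates of 96 wells but only 3 plates of 3456 wells, which indicates why the higher-density formats are attractive for large panels.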

Prior to being transfected with a trapping vector of the present invention, cells can 
be transfected with an exogenous gene capable of expressing an exogenous protein, such 
as a receptor (e.g., GPCR) or gene associated with the pathology of an etiological agent, 
such as a virus, bacterium, or parasite. Cells that express such exogenous proteins can then
be transfected with a trapping vector to form a library of clones that can be screened
using the present invention. The invention can also include animals with β-lactamase
expression constructs integrated into the genome of interest.

Many of the cells of the present invention can report modulation of biological
processes by a variety of additional reporter genes or chemicals or combinations thereof.
For example, beta-lactamase, an enzyme, can convert non-chromogenic substrates to
chromogenic products or alter the chromogenic or fluorescent properties of a substrate
such as CCF2. Furthermore, fluorescent reporters, such as fluorescent proteins (e.g.,
green fluorescent protein (GFP) molecules), can be used as reporters. Some mutant GFP
molecules have different fluorescent properties as compared to wild-type GFP. These
GFPs can be used as reporters and can be used singly or in combination with the present
invention. For example, cells can have multiple reporters that can be differentiated to
report different biological processes, or different steps within a biological process, such
as steps in a signal transduction pathway.

Targets

Proteins of interest that can be expressed in the cells of the invention include:
hormone receptors (e.g., mineralocorticoid, glucocorticoid, and thyroid hormone
receptors); intracellular receptors (e.g., orphans, retinoids, vitamin D3 and vitamin A
receptors); signaling molecules (e.g., kinases, transcription factors, or molecules such
as signal transducers and activators of transcription) (Science, Vol. 264, 1994, p. 1415-1421;
Mol. Cell Biol., Vol. 16, 1996, p. 369-375); receptors of the cytokine superfamily (e.g.,
erythropoietin, growth hormone, interferons, and interleukins (other than IL-8) and
colony-stimulating factors); G-protein coupled receptors, see US patent 5,436,128 (e.g.,
for hormones, calcitonin, epinephrine, gastrin, and paracrine or autocrine mediators, such
as somatostatin or prostaglandins) and neurotransmitter receptors (norepinephrine,
dopamine, serotonin or acetylcholine); tyrosine kinase receptors (such as insulin growth
factor, nerve growth factor (US patent 5,436,128)). Examples of the use of such proteins
are further described herein.

Any target, such as an intracellular or extracellular receptor involved in a signal
transduction pathway, such as the leptin or GPCR pathways, can be used with the present
invention. Furthermore, the genes activated or repressed by a target can be isolated and
identified, and modulators of those genes identified, using the present invention. For
example, the present invention can identify a G-protein coupled receptor (GPCR)
pathway, determine its function, isolate the genes modulated by the GPCR, and identify
modulators of such GPCR modulated proteins.

As an introduction to GPCR cell biology, the activation of Gα15 or Gα16 can,
through a G-protein signaling pathway, activate PLCβ, which in turn increases
intracellular calcium levels. An increase in calcium levels can lead to modulation of a
"calcium-responsive" promoter that is part of a signal transduction detection system, i.e.,
a promoter that is activated (e.g., an NFAT promoter or AP-1) or inhibited by a change in
calcium levels. One example of an NFAT DNA binding site is described in Shaw et al.,
Science 241:202-205 (1988). Likewise, a promoter that is responsive to changes in
protein kinase C levels (e.g., a "protein kinase C-responsive promoter") can be modulated
by an active Gα protein through a G-protein signaling pathway. Selected cells described
herein can also include a G-protein coupled receptor. Genes encoding numerous GPCRs
have been cloned (Simon et al., Science 252:802-808 (1991)), and conventional
molecular biology techniques can be used to express a GPCR on the surface of a cell of
the invention. Preferably, the responsive promoter can allow for only a relatively
short lag (e.g., less than 90 minutes) between engagement of the GPCR and
transcriptional activation. A preferred responsive promoter includes the nuclear factor of
activated T-cell promoter (Flanagan et al., Nature 352:803-807 (1991)). Polynucleotides
identified by methods of the invention can be used as response elements that are sensitive
to intracellular signals (signal-response elements). Signal response elements can be used
in the assays described herein, such as identification of useful chemicals. Such signal
response elements may be sensitive to intracellular signals that include voltage, pH, and
intracellular levels of Ca2+, ATP, ADP, cAMP, GTP, GDP, K+, Na+, Zn2+, oxygen,
metabolites and IP3.

In one aspect of the present invention, cells can be transformed to express an
exogenous receptor, such as a GPCR. Such a transduced cell line can then be further
transduced with a trapping vector to make a library of clones that can be used to identify
cells that report modulation of the exogenous receptor. Preferably, the host cell line
would not appreciably express the exogenous receptor.

Based on the unique structure of GPCRs, which have seven hydrophobic,
presumably trans-membrane, domains (see Watson and Arkinstall, The G-Protein Linked
Receptor Facts Book, Academic Press, New York (1994)), orphan GPCRs (GPCRs
having no known function) can be identified by searching sequence databases, such as
those provided by the National Library of Medicine (Bethesda, MD), for similar motifs
and homologies. This same strategy can, of course, be used for any target, especially
when a paradigm sequence or motif has been determined.
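
By way of a non-limiting illustration, the motif search described above could be approximated by a simple hydropathy pre-filter that counts hydrophobic stretches in a candidate protein sequence. The sketch below is an assumption and not part of the disclosed method: the Kyte-Doolittle hydropathy values are the published scale, while the function names, the 19-residue window, and the 1.6 threshold are hypothetical choices made only for this example.

```python
# Kyte-Doolittle hydropathy scale (published values).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def count_hydrophobic_segments(seq, window=19, threshold=1.6):
    """Count runs of consecutive windows whose mean hydropathy exceeds threshold.

    The window length and threshold are illustrative assumptions.
    Unknown residue codes are scored as 0.0.
    """
    seq = seq.upper()
    if len(seq) < window:
        return 0
    segments = 0
    in_segment = False
    for start in range(len(seq) - window + 1):
        mean = sum(KD.get(aa, 0.0) for aa in seq[start:start + window]) / window
        above = mean > threshold
        if above and not in_segment:
            segments += 1  # a new hydrophobic stretch begins here
        in_segment = above
    return segments


def has_seven_hydrophobic_domains(seq):
    """Hypothetical pre-filter: flag sequences with exactly seven stretches."""
    return count_hydrophobic_segments(seq) == 7


# Toy sequence: seven 20-residue leucine stretches separated by polar spacers.
toy = ("D" * 15 + "L" * 20) * 7 + "D" * 15
print(count_hydrophobic_segments(toy))
```

A candidate flagged by such a filter would still need to be compared against known GPCR sequences for homology; the filter only narrows the list.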

Drug Discovery for Viruses and Other Pathogens

The function of genes from viruses or other pathogens that affect the expression
of genes in cells, such as mammalian cells, can be determined using the present
invention. Furthermore, chemicals that modulate these genes can be identified using the
methods of the present invention. For example, many transforming viruses, after infecting
a cell, have the effect of up-regulating genes involved in cell proliferation, which allows
the virus-infected cells to produce additional viruses, which can infect additional cells.
These transforming viruses can act by stimulating a receptor from the target cell.
One example of the mechanism is the Friend erythroleukemia virus. This virus uses the
erythropoietin receptor for entry into the cells. When the virus is bound to the receptor, a
pathway is activated that causes an over-proliferation of red blood cells. If the activation
of the erythropoietin receptor is inhibited, a decrease in the accumulation of red blood
cells would result, which can prevent or reduce the severity of the leukemia. The
development of an assay that reports the activation of mammalian target genes allows the
identification of modulators of other viral or pathogen-dependent pathways. These
modulators can be used as therapeutic agents.

A general procedure for establishing this assay uses the virus or an isolated viral
protein as the stimulus for modulating a pathway. First, a gene-trapping library is made
using a cell line that can be infected by the virus or activated by the viral protein. The
virus is added to these cells, and clones are isolated that respond specifically to the
viral infection by the expression of a reporter gene.

As an example, the GP-120 portion of HIV protein is known to have a mitogenic
effect on cells exposed to GP-120, which indicates that downstream signaling pathways
are being activated that can be associated with the cytotoxicity of the virus and allow its
proliferation. Cell clones that are induced by this activation can be isolated and
used to screen for modulators of this cytotoxic or proliferative effect. Other viral
proteins, such as NEF from HIV, can be used. Chemicals that inhibit this effect can have
useful therapeutic value to treat viral infection or toxicity.

This approach can be applied to any cellular pathogen that has an effect on a
target cell, such as cytotoxicity, cell proliferation, inflammation or other responses.
Other etiological targets include other viruses, such as retroviruses, adenovirus,
papillomavirus, herpesviruses, cytomegalovirus, adeno-associated viruses, hepatitis
viruses, and any other virus. In addition to viruses, any other pathogen, such as parasites,
bacteria, and viroids, can be used in the present invention. Particular viral targets include,
but are not limited to, NEF, Hepatitis X protein, and other viral proteins, such as those
that can be encoded or carried by a virus. In addition, two or more viral components can
be added to identify coviral pathogenesis components. This is a particularly valuable tool
for identifying pathways modulated by two or more viruses concurrently, or over time as
in slow activating viral conditions. For example, cotransfection with HIV and CMV may
be used. Viral targets or components do not include oncogenes or proto-oncogenes found
in uninfected genomes, and gene products thereof.

Screening Test Chemicals Using Portions Of The Genome 

Cells comprising β-lactamase polynucleotides integrated in the genome can be
contacted with test chemicals or modulators of a biological process and screened for
activity. Usually, the test chemical being screened will have at least one defined target,
usually a protein. The test chemical is normally applied to the cells to achieve a final
predetermined concentration in the medium bathing the cells. Typically, screens are
conducted at concentrations of 100 µM or less, preferably 10 µM or less, and preferably
1 µM or less for confirmatory screens. As described more fully herein, cells can be
subjected to multiple rounds of screening and selection using the same chemical in each
round to ensure the identification of clones with the desired response to a chemical, or
with different chemicals to characterize which chemicals produce a response (either an
increase or decrease in β-lactamase activity) in the cells. Such methods can be applied to
any chemical that alters the function of any of the proteins mentioned herein or known in
the art.
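
As a non-limiting worked example of reaching a final predetermined concentration in the medium, the volume of a concentrated stock to add to a well can be estimated from the usual dilution relation C1 x V1 = C2 x V2. The stock concentration, well volume, and function names below are hypothetical values chosen for illustration; only the 100 µM, 10 µM, and 1 µM levels come from the text above.

```python
# Concentration levels (in µM) named in the text above.
SCREEN_LEVELS_UM = (100.0, 10.0, 1.0)


def stock_volume_ul(stock_mM, final_uM, well_volume_ul):
    """Volume of stock (µL) giving final_uM in a well of well_volume_ul.

    Uses C1 * V1 = C2 * V2 and treats well_volume_ul as the final volume.
    """
    stock_uM = stock_mM * 1000.0  # convert mM to µM
    if stock_uM <= 0 or final_uM > stock_uM:
        raise ValueError("stock must be positive and more concentrated than the target")
    return final_uM * well_volume_ul / stock_uM


# Hypothetical case: 10 mM stock, 100 µL final well volume.
for level in SCREEN_LEVELS_UM:
    print(level, "µM ->", stock_volume_ul(10.0, level, 100.0), "µL of stock")
```

Sub-microliter volumes such as those computed here are usually reached in practice through an intermediate dilution rather than a single direct addition.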

Chemicals and physiological processes without a defined target, however, can
also be used and screened with the cells of the invention. For example, once a clone is
identified as containing an active genomic polynucleotide that is activated by a particular
cellular signal (including extracellular signals), for instance by a neurotransmitter, that
same clone can be screened with chemicals lacking a defined target to determine if
activation by the neurotransmitter is blocked or enhanced by the chemical. This is a
particularly useful method for finding therapeutic targets downstream of receptor
activation (in this case a neurotransmitter). Such methods can be applied to any chemical
that alters the function of any of the proteins mentioned herein or known in the art. This
type of "targetless" assay is particularly useful as a screening tool for the medical conditions
and pathways described herein.

The methods and compositions described herein offer a number of advantages
over the prior art. For instance, screening of mammalian based gene integration libraries
is limited by the use of existing reporter systems. Many enzymatic reporter genes, such
as secreted alkaline phosphatase and luciferase, cannot be used to assay single living
cells (including by FACS) because the assay requires cell lysis to determine reporter gene
activity. Alternatively, β-galactosidase can detect expression in single cells but substrate
loading requires permeabilization of cells, which can cause deleterious effects on normal
cell functions. Additionally, the properties of fluorescent β-galactosidase substrates, such
as fluorescein di-β-D-galactopyranoside, and products make it very difficult to screen
large libraries for both expressing and non-expressing cells because the substrate and
product are not well retained and do not permit ratiometric analysis to determine the
amount of uncleaved substrate. Green fluorescent protein (GFP), a non-enzymatic reporter,
could be used to detect expression in single living cells but has limited sensitivity. GFP
expression level would have to be at least 100,000 molecules per cell to be detectable in a
screening format and small changes in, or low levels of, gene expression could not be
measured. Furthermore, GFP is relatively stable and would not be suitable for measuring
down-regulation of genes. Other advantages of the invention are described herein or
readily recognized by one skilled in the art upon reviewing this disclosure.

Methods for Rapidly Identifying Modulators of Genomic Polynucleotides 

The invention provides for a method of identifying proteins or chemicals that
directly or indirectly modulate a genomic polynucleotide. Generally, the method
comprises inserting a β-lactamase expression construct into a eukaryotic genome,
usually non-yeast, contained in at least one living cell, contacting the cell with a
predetermined concentration of a modulator, and detecting β-lactamase activity in the
cell. Preferably, cleavage of a membrane permeant β-lactamase substrate is measured
and the membrane permeant β-lactamase substrate is transformed in the cell into a
trapped substrate. Preferably, the β-lactamase expression construct comprises a
β-lactamase polynucleotide, a splice donor, a splice acceptor and an IRES element. The
method can also include determining the coding nucleic acid sequence of a
polynucleotide operably linked to the β-lactamase expression construct using techniques
known in the art, such as RACE.

Modulator Identification

Modulators described herein can be used in this system to test for an increase or
decrease in β-lactamase activity in successfully integrated clones. Such cells can
optionally include specific proteins of interest as discussed herein. For example, the cell
can include a protein or receptor that is known to bind the modulator (e.g., a nuclear
receptor or receptor having a transmembrane domain heterologously or homologously
expressed by the cell). A second modulator can be added either simultaneously or
sequentially to the cell or cells and β-lactamase activity can be measured before, during
or after such additions. Cells can be separated on the basis of their response to the
modulator (e.g., responsive or non-responsive) and can be characterized with a number of
different modulators to create a profile of cell activation or inhibition.

β-lactamase activity will often be measured in relation to a reference sample,
often a control. For example, β-lactamase activity is measured in the presence of the
modulator and compared to the β-lactamase activity in the absence of the modulator or
possibly a second modulator. Alternatively, β-lactamase activity is measured from a cell
expressing a protein of interest and compared to that from a cell not expressing the
protein of interest (usually the same cell type). For instance, a modulator may be known
to bind to a receptor expressed by the cell and the β-lactamase activity in the cell is
increased in the presence of the modulator compared to the β-lactamase activity detected
from a corresponding cell in the presence of the modulator, wherein the corresponding
cell does not express the receptor.
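
The comparison to a reference sample described above can be sketched, purely for illustration, as a fold-change calculation. The function names and the two-fold cutoff are assumptions introduced for this example and are not values taken from the disclosure; the activity readings are arbitrary units.

```python
def fold_change(sample, reference):
    """Ratio of a β-lactamase activity reading to its reference reading."""
    if reference <= 0:
        raise ValueError("reference activity must be positive")
    return sample / reference


def classify_response(with_modulator, without_modulator, min_fold=2.0):
    """Label a clone by comparing activity with and without the modulator.

    min_fold is a hypothetical cutoff, not a value from the source.
    """
    fc = fold_change(with_modulator, without_modulator)
    if fc >= min_fold:
        return "increased"
    if fc <= 1.0 / min_fold:
        return "decreased"
    return "unchanged"


def receptor_dependent(receptor_cell, control_cell, min_fold=2.0):
    """Both cells receive the modulator; only one expresses the receptor."""
    return fold_change(receptor_cell, control_cell) >= min_fold


print(classify_response(4.0, 1.0))
print(receptor_dependent(3.0, 1.0))
```

The second helper mirrors the last case in the paragraph above, where the reference is a corresponding cell that lacks the receptor rather than an untreated cell.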

Pathway Identification and Modulators 

When a reporter gene of the invention integrates into the genome of a host cell 
such that the reporter gene is expressed under a variety of circumstances, these clones can 
be used for drug discovery and functional genomics. These clones report the modulation 
of the reporter gene in response to a variety of stimuli, such as hormones and other 
physiological signals. These stimuli can be involved in a variety of known or unknown 
pathways that are modulated by known or unknown modulators or targets. Thus, these 
clones can be used as a tool to discover chemicals that modulate a particular pathway or 
to determine a cellular pathway. 

These pathways are quite varied, and fall into general classes, which have specific
species, which can be modulated by known or unknown modulators or agonists or
antagonists thereof. By way of example, Table 1 illustrates various pathways, species of
these classes, and known modulators of these species. The invention can be used to
identify regions of the genome that are modulated by such pathways or physiological
events.


TABLE 1
Pathways and modulators

Pathway/Physiological Event (Genus)   Species                     Known Modulator

Nuclear receptors                     Estrogen receptor           Estrogen
Cytokines                             IL-2 receptor               IL-2
GPCRs                                 Vasopressin receptor        Vasopressin
Transcription factors                 Fos or Jun                  NFAT
Kinase dependent                      Protein kinase C            PMA
Phosphatase dependent                 Calcineurin                 Cyclosporin A
Protease dependent                    Metalloprotease             TIMPs
Chemokine                             CCR1                        RANTES
Ion channels                          Calcium channels            Many known blockers
Second messenger dependent            Cyclic AMP                  cAMP inhibitor protein
Cell differentiation                  Hematopoietic development   EPO
Cell growth                           IL-2 receptor               IL-2
Cell cycle dependent                  CDK                         P21
Apoptosis                             Fas                         P53


In one embodiment, the invention provides for a genomic assay system to identify
downstream transcriptional targets for signaling pathways. This method requires the
target of interest to activate gene expression upon addition of a chemical or expression of
the target protein. A cell line that is the most similar to the tissue type where the target
functions is preferred for generating a library of clones with different integration sites
with β-lactamase polynucleotides or other reporter genes. This cell line may be known to
elicit a cellular response, such as differentiation, upon addition of a particular modulator.
If this type of cell line is available, it is preferred for screening, as it represents the native
context of the target. If a cell line is not available that homologously expresses the target,
a cell line can be generated by heterologously expressing the target in the most relevant
cell line. For instance, if the target is normally expressed in the lymphoid cells, then a
lymphoid cell line would be used to generate the library.

The library of clones, as described further herein, can be separated into two pools
by FACS using the FRET system described herein: an expressing pool (e.g., blue cells)
and a non-expressing pool (e.g., green cells). These two pools can then be treated with a
modulator followed by FACS to isolate induced clones (e.g., green to blue) or repressed
clones (e.g., blue to green). Additional rounds of stimulation followed by FACS can be
performed to verify initial results. The specificity of activation can be tested by adding
additional chemicals that would not activate the defined target. This would allow the
identification of clones that have β-lactamase polynucleotides integrated into genes
activated by a variety of cellular signals.
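
The two-pool sorting logic described above can be sketched as follows, as an illustration only. The sketch assumes that each clone is summarized by a blue-to-green emission ratio measured before and after modulator treatment; the ratio cutoff of 1.0, the clone names, and the function names are hypothetical and are not taken from the disclosure.

```python
def color(ratio, cutoff=1.0):
    """Call a cell blue (expressing) or green (non-expressing).

    ratio is an assumed blue/green emission ratio; cutoff is illustrative.
    """
    return "blue" if ratio > cutoff else "green"


def classify_clone(ratio_before, ratio_after, cutoff=1.0):
    """Induced: green to blue. Repressed: blue to green. Otherwise unchanged."""
    before = color(ratio_before, cutoff)
    after = color(ratio_after, cutoff)
    if before == "green" and after == "blue":
        return "induced"
    if before == "blue" and after == "green":
        return "repressed"
    return "unchanged"


def sort_pools(clones, cutoff=1.0):
    """Group clone names by their response to the modulator."""
    pools = {"induced": [], "repressed": [], "unchanged": []}
    for name, (before, after) in clones.items():
        pools[classify_clone(before, after, cutoff)].append(name)
    return pools


# Hypothetical readings: (ratio before modulator, ratio after modulator).
library = {"clone_A": (0.2, 3.1), "clone_B": (2.8, 0.3), "clone_C": (0.2, 0.25)}
print(sort_pools(library))
```

Repeating the same classification after a further round of stimulation would correspond to the verification rounds mentioned above.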

Once a pool of cells with the desired characteristics is isolated, the cells can be
expanded and their corresponding genes cloned and characterized. Targets which could
be used in this assay system include receptors, kinases, protein/protein interactions or
transcription factors and other proteins of interest discussed herein.

In another embodiment, the invention provides for a method of identifying
developmentally or tissue specific expressed genes. A β-lactamase polynucleotide can be
inserted, usually randomly, into any precursor cell such as an embryonic or hematopoietic
stem cell to create a library of clones. Constitutively expressing clones can be collected
by sorting for blue cells and non-expressing cells collected by sorting for green cells
using the FRET system described herein. The library of clones can then be stimulated or
allowed to differentiate, and induced or repressed clones isolated. Cell surface markers in
conjunction with fluorescent tagged antibodies or other detector molecules could be used
to monitor the expression of reference genes simultaneously. Additionally, by
stimulating and sorting stem cells at various developmental stages, it is possible to rapidly
identify genes responsible for maturation and differentiation of particular tissues.

Additionally, clones that have a β-lactamase polynucleotide integrated, either
randomly or by homologous recombination, into developmentally expressed genes can be
used with FACS to isolate specific cell populations for further study, such as screening.
Such methods can be used for identifying cell populations that have stem cell properties,
as well as providing an intracellular reporter that allows isolation and screening of such a
population of cells.

The present invention can yield screening cell lines for a variety of targets whose
downstream signaling elements are already known or postulated. These screening cell
lines can be used either to screen for modulators of transfected targets or as readouts for
expression cloning or functional analysis of uncharacterized targets. Screening cell lines
can be made for any pathway or any modulator, such as those described in Table 1.

In the case of ion channels, cell lines are generated in which β-lactamase
expression is used to detect a voltage change. This is possible because intracellular
signaling is sensitive to membrane potential and will modulate the expression of a subset
of genes. In one example, a library of neuronal cells prepared following the general
methods set forth in Examples 1 to 13, such as dorsal root neuroblastoma cells, can be
screened for a response to a depolarization by incubating cells in high potassium (high
K+) medium. Depending on the particular characteristics of the cell library and the
method used, clones with a transcriptional response to a depolarizing treatment are
identified by sorting for cells which changed from either green to blue or blue to green
after depolarization. These clones are designated as voltage-sensitive clones and can be
used as screening cell lines to identify chemicals that modulate ion channels (either
endogenously expressed or transfected) which cause a voltage change upon either
activation or inhibition (e.g., K+ or Na+ channels). These cells are also useful for
expression cloning of ion channels. For example, a voltage-sensitive clone could be
transfected with a cDNA library. Those cells transfected with functional channels that
shift the membrane potential are detected via beta-lactamase and the cDNA gene products
are analyzed for activity as ion channels.

Furthermore, a gene encoding a known ion channel can be transfected into the
voltage sensitive cell line and then used as a screen for channel modulators. For example,
expression or pharmacological activation of a Na+ channel can cause a depolarization that
can be reported by the cell line. This cell line can be used to screen for ion channel
modulators, either agonists or antagonists depending on the experimental protocol. In a
variation of this approach, a genomic library from a cell line lacking channels, such as
L929 cells, can be directly transfected with a K+ channel gene. The expression of the K+
channel causes a voltage shift, such as a hyperpolarization, causing a change in
expression of certain voltage-sensitive genes. The clones expressing these genes can be
used to screen for regulators of the ion channel.

Orphan protein signaling pathway identification and orphan protein modulators 

In another embodiment, the invention provides for a method of identifying 
modulators of orphan proteins or genomic polynucleotides that are directly or indirectly 
modulated by an orphan protein. Human disease genes are often identified and found to 
show little or no sequence homology to functionally characterized genes. Such genes are 
often of unknown function and thus encode an "orphan protein." Usually such orphan
proteins share less than 25% amino acid sequence homology with other known proteins 
or are not considered part of a gene family. With such molecules there is usually no 
therapeutic starting point. By using libraries of the herein described clones, one can 
extract functional information about these novel genes. 

Orphan proteins can be expressed, preferably overexpressed, in living mammalian
cells. By inducing overexpression of the orphan gene and monitoring the effect on
specific clones, one may identify genes that are transcriptionally regulated by the orphan
protein. By identifying genes whose expression is influenced by the novel disease gene
or other orphan protein, one may predict the physiological bases of the disease or function
of the orphan molecule. Insights gained using this method can lead to identification of a
valid therapeutic target for disease intervention.

Modulator Identification using Genomic Polynucleotides Activated by Cellular Signals 

In another embodiment, the invention provides for a method of screening a 
defined target or modulator using genomic polynucleotides identified with the methods 
described herein. The gene identification methods described herein can also be used in 
conjunction with a screening system for any target that functions (either naturally or 
artificially) through transcriptional regulation. 

In many instances a receptor and its ligand are known but not the downstream
biological processes required for signaling. For example, a cytokine receptor and
cytokine may be known but the downstream signaling mechanism is not. A library of
clones generated from a cell line that expresses the cytokine receptor can be screened to
identify clones showing changes in gene expression when stimulated by the cytokine.
The induced genes could be characterized to describe the signaling pathway. Using the
methods of the invention, gene characterization is not required for screen development, as
identification of a cell clone that specifically responds to the cytokine constitutes a usable
secondary screen. Therefore, clones that show activation or deactivation upon the
addition of the cytokine can be expanded and used to screen for agonists or antagonists of
the cytokine receptor. The advantage of this type of screening is that it does not require an
initial understanding of the signaling pathway and is therefore uniquely capable of
identifying leads for novel pathways.

In another embodiment, the invention provides for a method of functionally
characterizing a target using a panel of clones having active genomic polynucleotides as
identified herein. As large numbers of specifically responding cell lines containing active
genomic polynucleotides identified with a particular biological process or modulator are
generated, panels containing specific clones can be used for functional analysis of other
potential cellular modulators. These panels of responding cell lines can be used to
rapidly profile potential transcriptional regulators. Such panels, in addition to containing
clones with identified active genomic polynucleotides generated by the
invention, can include clones generated by more traditional methods. Clones can
be generated that contain both the identified active genomic polynucleotide with a
β-lactamase polynucleotide and specific response elements, such as SRE, CRE, NFAT,
TRE, IRE, or reporters under the control of specific promoters. These panels would
therefore allow the rapid analysis of potential effectors and their mechanisms of cellular
activation. A second reporter (e.g., a β-galactosidase gene) can also be used with this
method, as well as the other methods described herein.

In another embodiment, the invention provides for a method of test chemical
profiling using a clone or panel of clones having identified active polynucleotides. Test
chemical characterization is similar to target characterization except that the cellular
target(s) do not have to be known. This method will therefore allow the analysis of test
chemical (e.g., lead drug) effects on cellular function by defining genes affected by the
drug or drug lead.

Such a method can find application in the area of drug discovery and secondary
effects (e.g., cytotoxic effects) of drugs. The potential drug would be added to a library of
genomic clones, and clones which were either induced or repressed would be isolated or
identified. This method is analogous to target characterization except that the secondary
drug target is unknown. As well as providing a screen for the secondary effects, the assay
provides information on the mechanism of toxicity.

Methods Related to FACS and Identifying Active Genomic Polynucleotides

The invention provides for a method of identifying active genomic
polynucleotides using clones having integrated β-lactamase polynucleotides and FACS.
β-lactamase integration libraries can be used in a high-throughput screening format, such
as FACS, to detect transcriptional regulation. The compatibility of β-lactamase assays
with FACS enables a systematic method for defining patterns of transcriptional regulation
mediated by a range of factors. This approach has not been feasible or practical using
existing reporter systems. This new method will allow rapid identification of genes
responding to a variety of signals, including during tissue-specific expression and pattern
formation.

For example, after integration of a β-lactamase polynucleotide, expressing and
non-expressing cells can be separated by FACS. These two cell populations can be
treated with potential modulators and changes in gene expression can be monitored using
a ratiometric fluorescent readout. Pools of clones will be isolated that show either up- or
down-regulation of reporter gene expression. Target genes from responding clones can
then be identified. In addition, by being able to separate expressing and non-expressing
cells at different time points after modulator addition, genes which are differentially
regulated over time can be identified. This approach therefore enables the elucidation of
transcription cascades mediated by cellular signaling. Specifically, it will provide a
means to identify downstream genes which are transcriptionally regulated by a variety of
molecules including nuclear receptors, cytokine receptors or transcription factors.
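
The time-resolved separation described above can be illustrated by ordering clones according to the earliest time point at which their expression state changes. This is a sketch under stated assumptions: the blue-to-green ratio readout, the cutoff of 1.0, the time points, and all names below are hypothetical and serve only to show how a cascade might be ordered from such data.

```python
def first_response_time(baseline_ratio, timecourse, cutoff=1.0):
    """Earliest time (minutes) at which the expressing state differs from baseline.

    timecourse is a list of (minutes, ratio) pairs; returns None if the
    clone never changes state. The cutoff is an illustrative assumption.
    """
    baseline_on = baseline_ratio > cutoff
    for minutes, ratio in sorted(timecourse):
        if (ratio > cutoff) != baseline_on:
            return minutes
    return None


def order_cascade(clones, cutoff=1.0):
    """Return responding clone names ordered from earliest to latest response."""
    responders = []
    for name, (baseline, timecourse) in clones.items():
        t = first_response_time(baseline, timecourse, cutoff)
        if t is not None:
            responders.append((t, name))
    return [name for _, name in sorted(responders)]


# Hypothetical data: baseline ratio, then readings at 30 and 120 minutes.
data = {
    "late": (0.2, [(30, 0.3), (120, 2.8)]),
    "early": (0.2, [(30, 3.0), (120, 3.5)]),
    "silent": (0.2, [(30, 0.2), (120, 0.3)]),
}
print(order_cascade(data))
```

Clones that respond early are candidates for genes acting upstream in a transcription cascade, while later responders may depend on the products of earlier ones.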

Applications of this technology are nearly unlimited in the areas of gene discovery
and functional analysis. Libraries of cell lines from various tissue types could be
generated and used to identify genes with specific expression patterns or regulation
mechanisms. These libraries of clones would represent millions of integration sites
saturating the genome and can permit the identification of any expressed gene based on
its transcriptional regulation. The features of the β-lactamase reporter system, in part,
allow its use for this genomic integration assay in a high-throughput format.

There are a variety of other approaches that may be used with the invention,
including approaches similar to those proposed for β-lactamase. Examples would include
antibody epitopes presented on the cell surface, with fluorescent antibodies used to detect
positive cells. Gel matrices that retain secreted reporters and allow detection of
positive cells could also be used. These approaches would, however, be limited in
sensitivity and would not be ratiometric in their detection. They would therefore allow
only the sorting of positive cells based on fluorescence intensity.

Once active genomic polynucleotides have been identified, they can be sequenced
using various methods, including RACE (rapid amplification of cDNA ends). RACE is a
procedure for the identification of unknown mRNA sequences that flank known mRNA
sequences. Both 5' and 3' ends can be identified depending on the RACE conditions. 5'
RACE is done by first preparing RNA from a cell line or tissue of interest. This total or
polyA RNA is then used as a template for a reverse transcription reaction, which can be
either random primed or primed with a gene-specific primer. A polynucleotide linker
of known sequence is then attached to the 3' end of the newly transcribed cDNA by
terminal transferase or RNA ligase. This cDNA is then used as the template for PCR
using one primer within the reporter gene and the other primer corresponding to the
sequence that had been linked to the 3' end of the first strand cDNA. The present
invention is particularly well suited for such techniques and does not require
construction of additional clones or constructs once the genomic polynucleotide has been
identified.

Substrates for Measuring β-lactamase Activity

Any membrane permeant β-lactamase substrate capable of being measured
inside the cell after cleavage can be used in the methods and compositions of the
invention. Membrane permeant β-lactamase substrates will not require permeabilizing
eukaryotic cells either by hypotonic shock or by electroporation. Generally, such
non-specific pore forming methods are not desirable to use in eukaryotic cells because such
methods injure the cells, thereby decreasing viability and introducing additional variables
into the screening assay (such as loss of ionic and biological contents of the shocked or
porated cells). Such methods can be used in cells with cell walls or membranes that
significantly prevent or retard the diffusion of such substrates. Preferably, the membrane
permeant β-lactamase substrates are transformed in the cell into a β-lactamase substrate
of reduced membrane permeability (usually at least five-fold less permeable) or that is
membrane impermeant. Transformation inside the cell can occur via intracellular
enzymes (e.g., esterases) or intracellular metabolites or organic molecules (e.g., sulfhydryl
groups). Preferably, such substrates are fluorescent. Fluorescent substrates include those
capable of changes, either individually or in combination, of total fluorescence, excitation
or emission spectra, or FRET.

Preferably, FRET type substrates are employed with the methods and
compositions of the invention, including fluorogenic substrates of the general formula I:

D-S-A

wherein D is a FRET donor, A is a FRET acceptor, and S is a substrate for a protein
with β-lactamase activity. β-Lactamase activity cleaves either the D-S or the S-A bond,
thereby releasing either D or A, respectively, from S. Such cleavage resulting from
β-lactamase activity dramatically increases the distance between D and A, which usually
causes a complete loss of energy transfer between D and A. Generally, molecules of D-S-A
structure are constructed to maximize the energy transfer between D and A. Preferably,
the distance between D and A is generally equal to or less than the R₀.

As would readily be appreciated by those skilled in the art, the efficiency of
fluorescence resonance energy transfer depends on the fluorescence quantum yield of the
donor fluorophore, the donor-acceptor distance, and the overlap integral of donor
fluorescence emission and acceptor absorption. The energy transfer is most efficient
when a donor fluorophore with high fluorescence quantum yield (preferably, one
approaching 100%) is paired with an acceptor with a large extinction coefficient at
wavelengths coinciding with the emission of the donor. The dependence of fluorescence
energy transfer on the above parameters has been reported (Förster, T. (1948) Ann. Physik
2: 55-75; Lakowicz, J. R., Principles of Fluorescence Spectroscopy, New York: Plenum
Press (1983); Herman, B., Resonance energy transfer microscopy, in: Fluorescence
Microscopy of Living Cells in Culture, Part B, Methods in Cell Biology, Vol. 30, ed.
Taylor, D.L. & Wang, Y.L., San Diego: Academic Press (1989), pp. 219-243; Turro, N.
J., Modern Molecular Photochemistry, Menlo Park: Benjamin/Cummings Publishing Co.,
Inc. (1978), pp. 296-361), and tables of spectral overlap integrals are readily available to
those working in the field (for example, Berlman, I.B., Energy transfer parameters of
aromatic compounds, Academic Press, New York and London (1973)). The distance
between donor fluorophore and acceptor dye at which fluorescence resonance energy
transfer (FRET) occurs with 50% efficiency is termed R₀ and can be calculated from the
spectral overlap integrals. For the donor-acceptor pair fluorescein - tetramethyl
rhodamine, which is frequently used for distance measurement in proteins, this distance R₀
is around 50-70 Å (dos Remedios, C.G. et al. (1987) J. Muscle Research and Cell Motility
8:97-117). The distance at which the energy transfer in this pair exceeds 90% is about 45
Å. When attached to the cephalosporin backbone, the distances between donors and
acceptors are in the range of 10 Å to 20 Å, depending on the linkers used and the size of
the chromophores. For a distance of 20 Å, a chromophore pair will have to have a
calculated R₀ of larger than 30 Å for 90% of the donors to transfer their energy to the
acceptor, resulting in better than 90% quenching of the donor fluorescence. Cleavage of
such a cephalosporin by β-lactamase relieves quenching and produces an increase in donor
fluorescence efficiency in excess of tenfold. Accordingly, it is apparent that
identification of appropriate donor-acceptor pairs for use as taught herein in accordance
with the present invention would be essentially routine to one skilled in the art.
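
The distance figures quoted above can be checked against the standard Förster relation, E = 1 / (1 + (r/R₀)⁶), in which r is the donor-acceptor distance and E the transfer efficiency. The sketch below is a minimal illustration of that relation; the function names are hypothetical, and it treats each donor-acceptor pair as held at a single fixed distance, which is a simplification.

```python
def fret_efficiency(r, r0):
    """Förster transfer efficiency for donor-acceptor distance r and radius r0.

    Both distances must use the same unit (e.g., angstroms).
    """
    return 1.0 / (1.0 + (r / r0) ** 6)


def min_r0_for_efficiency(r, efficiency):
    """Smallest Förster radius that gives at least `efficiency` at distance r.

    Obtained by solving E = 1 / (1 + (r/R0)^6) for R0.
    """
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must be strictly between 0 and 1")
    return r * (efficiency / (1.0 - efficiency)) ** (1.0 / 6.0)


def donor_gain_on_cleavage(efficiency):
    """Fold increase in donor emission when quenching is fully relieved.

    Assumes the intact substrate emits a fraction (1 - E) of the
    unquenched donor signal and the cleaved product emits all of it.
    """
    return 1.0 / (1.0 - efficiency)


if __name__ == "__main__":
    print(fret_efficiency(50.0, 50.0))       # 50% efficiency at r = R0
    print(min_r0_for_efficiency(20.0, 0.9))  # R0 needed for 90% at 20 angstroms
    print(donor_gain_on_cleavage(0.9))       # fold increase after cleavage
```

Under these assumptions a 20 Å separation needs an R₀ of roughly 29 Å for 90% transfer, in line with the rounded figure of 30 Å given above, and 90% quenching corresponds to about a tenfold rise in donor fluorescence on cleavage.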

Reporter gene substrates described in Tsien et al., PCT Publication No.
WO96/30540, published October 3, 1996, are preferred for β-lactamase.

Fluorescence Measurements

When using fluorescent substrates, it will be recognized that different types of
fluorescent monitoring systems can be used to practice the invention. Preferably, FACS
systems are used, or systems dedicated to high throughput screening, e.g., 96 well or
greater microtiter plates. Methods of performing assays on fluorescent materials are well
known in the art and are described in, e.g., Lakowicz, J. R., Principles of Fluorescence
Spectroscopy, New York: Plenum Press (1983); Herman, B., Resonance energy transfer
microscopy, in: Fluorescence Microscopy of Living Cells in Culture, Part B, Methods in
Cell Biology, vol. 30, ed. Taylor, D.L. & Wang, Y. L., San Diego: Academic Press
(1989), pp. 219-243; Turro, N. J., Modern Molecular Photochemistry, Menlo Park:
Benjamin/Cummings Publishing Co., Inc. (1978), pp. 296-361.

Fluorescence in a sample can be measured using a fluorimeter. In general, 
excitation radiation, from an excitation source having a first wavelength, passes through 
excitation optics. The excitation optics cause the excitation radiation to excite the 
sample. In response, fluorescent proteins in the sample emit radiation that has a 
wavelength that is different from the excitation wavelength. Collection optics then 
collect the emission from the sample. The device can include a temperature controller to 
maintain the sample at a specific temperature while it is being scanned. According to one 
embodiment, a multi-axis translation stage moves a microtiter plate holding a plurality of 
samples in order to position different wells to be exposed. The multi-axis translation 
stage, temperature controller, auto-focusing feature, and electronics associated with
imaging and data collection can be managed by an appropriately programmed digital 
computer. The computer also can transform the data collected during the assay into 
another format for presentation. 

Preferably, FRET is used as a way of monitoring β-lactamase activity inside a
cell. The degree of FRET can be determined by any spectral or fluorescence lifetime
characteristic of the excited construct, for example, by determining the intensity of the
fluorescent signal from the donor, the intensity of fluorescent signal from the acceptor,
the ratio of the fluorescence amplitudes near the acceptor's emission maximum to the
fluorescence amplitudes near the donor's emission maximum, or the excited state lifetime
of the donor. For example, cleavage of the linker increases the intensity of fluorescence
from the donor, decreases the intensity of fluorescence from the acceptor, decreases the
ratio of fluorescence amplitudes from the acceptor to that from the donor, and increases
the excited state lifetime of the donor.

Preferably, changes in the degree of FRET are determined as a function of the
change in the ratio of the amount of fluorescence from the donor and acceptor moieties, a
process referred to as "ratioing." Changes in the absolute amount of substrate, excitation
intensity, and turbidity or other background absorbances in the sample at the excitation
wavelength affect the intensities of fluorescence from both the donor and acceptor
approximately in parallel. Therefore the ratio of the two emission intensities is a more
robust and preferred measure of cleavage than either intensity alone.
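
Why ratioing cancels such common-mode factors can be illustrated with a toy model. The sketch below is an assumption-laden illustration, not a description of any instrument: it supposes that a single factor (here called `loading`, standing for substrate amount or excitation intensity) scales both emission channels equally, that cleaved product emits only in the donor channel, and that intact substrate emits mainly in the acceptor channel with a small, arbitrary donor leak. All names and numeric values are hypothetical.

```python
def simulate_cell(loading, cleaved_fraction, donor_leak=0.05):
    """Return (donor, acceptor) intensities for one cell in arbitrary units.

    loading scales both channels equally (substrate amount, lamp intensity).
    cleaved_fraction is the fraction of substrate cleaved by the reporter.
    donor_leak is the residual donor emission of intact substrate (assumed).
    """
    if not 0.0 <= cleaved_fraction < 1.0:
        raise ValueError("cleaved_fraction must be in [0, 1)")
    intact = 1.0 - cleaved_fraction
    donor = loading * (cleaved_fraction + intact * donor_leak)
    acceptor = loading * intact
    return donor, acceptor


def emission_ratio(donor, acceptor):
    """Donor-to-acceptor emission ratio; rises as substrate is cleaved."""
    if acceptor <= 0:
        raise ValueError("acceptor intensity must be positive")
    return donor / acceptor


if __name__ == "__main__":
    for loading in (1.0, 3.0):
        donor, acceptor = simulate_cell(loading, cleaved_fraction=0.5)
        print(loading, donor, acceptor, emission_ratio(donor, acceptor))
```

In this model two cells with the same degree of cleavage but a threefold difference in loading show threefold different raw intensities and the same ratio, which is the property the paragraph above relies on. The ratio is written here as donor over acceptor; the inverse convention carries the same information.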

The excited state lifetime of the donor moiety is, likewise, independent of the
absolute amount of substrate, excitation intensity, or turbidity or other background
absorbances. Its measurement requires equipment with nanosecond time resolution,
except in the special case of lanthanide complexes, in which case microsecond to
millisecond resolution is sufficient.

The ratiometric fluorescent reporter system described herein has significant
advantages over existing reporters for gene integration analysis, as it allows sensitive
detection and isolation of both expressing and non-expressing single living cells. This
assay system uses a non-toxic, non-polar fluorescent substrate that is easily loaded and
then trapped intracellularly. Cleavage of the fluorescent substrate by β-lactamase yields a
fluorescent emission shift as substrate is converted to product. Because the β-lactamase
reporter readout is ratiometric, it is unique among reporter gene assays in that it controls
for variables such as the amount of substrate loaded into individual cells. The stable,
easily detected, intracellular readout eliminates the need for establishing clonal cell lines
prior to expression analysis. With the β-lactamase reporter system or other analogous
systems, flow sorting can be used to isolate both expressing and non-expressing cells from
pools of millions of viable cells. This positive and negative selection allows its use with
gene identification methods to isolate desired clones from large clone pools containing
millions of cells, each containing a unique integration site.

High Throughput Screening System

The present invention can be used with systems and methods that utilize
automated and integratable workstations for identifying modulators, pathways, chemicals
having useful activity and other methods described herein. Such systems are described
generally in the art (see, U.S. Patent Nos: 4,000,976 to Kramer et al. (issued January 4,
1977), 5,104,621 to Pfost et al. (issued April 14, 1992), 5,125,748 to Bjornson et al.
(issued June 30, 1992), 5,139,744 to Kowalski (issued August 18, 1992), 5,206,568 to
Bjornson et al. (issued April 27, 1993), 5,350,564 to Mazza et al. (issued September 27,
1994), 5,589,351 to Harootunian (issued December 31, 1996), and PCT Application Nos: WO
93/20612 to Baxter Deutschland GMBH (published October 14, 1993), WO 96/05488 to
McNeil et al. (published February 22, 1996) and WO 93/13423 to Agong et al. (published
July 8, 1993)).

Typically, such a system includes: A) a storage and retrieval module comprising
storage locations for storing a plurality of chemicals in solution in addressable wells and
a well retriever, and having programmable selection and retrieval of the addressable wells
and a storage capacity for at least 10,000 of the addressable wells, B) a sample
distribution module comprising a liquid handler to aspirate or dispense solutions from
the selected addressable wells, the sample distribution module having programmable
selection of, and aspiration from, the selected addressable wells and programmable
dispensation into selected addressable wells (including dispensation into arrays of
addressable wells with different densities of addressable wells per centimeter squared), C)
a sample transporter to transport the selected addressable wells to the sample distribution
module and optionally having programmable control of transport of the selected
addressable wells (including adaptive routing and parallel processing), D) a reaction
module comprising either a reagent dispenser to dispense reagents into the selected
addressable wells or a fluorescent detector to detect chemical reactions in the selected
addressable wells, and E) a data processing and integration module. The addressable wells
should be made of biocompatible materials that are also compatible with the assay to be
performed (see, U.S. Patent Application Attorney Docket No.: 08366/008001, "Systems
and methods for rapidly identifying useful chemicals in liquid samples" (Stylli et al., filed
May 16, 1997), which is incorporated herein by reference).

The storage and retrieval module, the sample distribution module, and the reaction
module are integrated and programmably controlled by the data processing and
integration module. The storage and retrieval module, the sample distribution module,
the sample transporter, the reaction module and the data processing and integration
module are operably linked to facilitate rapid processing of the addressable sample wells.
Typically, devices of the invention can process about 10,000 to 100,000 addressable
wells, which can represent about 5,000 to 100,000 chemicals, in a 24-hour period. Cell
clones generated using the present invention can be individually deposited into wells of a
multi-well platform having any number of wells, such as 96, 864, 3456, or more. The
cells in the wells can be cultured, stored, screened, and inventoried using such a system.
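
To give a feel for the scale implied by these figures, the short sketch below converts a daily well count into a number of plates for each of the plate formats named above, and into an average per-minute rate. It is plain arithmetic on the numbers in the text; the function names are hypothetical and no instrument timing is implied.

```python
def plates_needed(wells, wells_per_plate):
    """Number of plates required to hold `wells` wells (rounded up)."""
    if wells < 0 or wells_per_plate <= 0:
        raise ValueError("wells must be >= 0 and wells_per_plate > 0")
    return -(-wells // wells_per_plate)  # integer ceiling division


def wells_per_minute(wells_per_day):
    """Average processing rate implied by a 24-hour throughput figure."""
    return wells_per_day / (24 * 60)


if __name__ == "__main__":
    for plate_format in (96, 864, 3456):
        print(plate_format, plates_needed(100_000, plate_format))
    print(round(wells_per_minute(100_000), 1))
```

At the upper figure of 100,000 wells in 24 hours, this works out to about 1,042 plates of 96 wells, 116 plates of 864 wells, or 29 plates of 3456 wells, an average of roughly 69 wells per minute.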

The present invention is also directed to chemical entities and information (e.g.,
modulators or chemicals or databases of biological activities of chemicals or targets)
generated or discovered by operation of the present invention, particularly chemicals and
information generated using such systems.

Pharmacology and Toxicity of Candidate Modulators 

The structure of a candidate modulator identified by the invention can be
determined or confirmed by methods known in the art, such as mass spectroscopy. For
putative modulators stored for extended periods of time, the structure, activity, and
potency of the putative modulator can be confirmed.

Depending on the system used to identify a candidate modulator, the candidate
modulator will have putative pharmacological activity. For example, if the candidate
modulator is found to inhibit T-cell proliferation (activation) in vitro, then the candidate
modulator would have presumptive pharmacological properties as an immunosuppressant
or anti-inflammatory (see, Suthanthiran et al., Am. J. Kidney Disease 28:159-172
(1996)). Such nexuses are known in the art for several disease states, and more are
expected to be discovered over time. Based on such nexuses, appropriate confirmatory in
vitro and in vivo models of pharmacological activity, as well as toxicology, can be
selected. The methods described herein can also be used to assess pharmacological
selectivity and specificity, and toxicity.

Once identified, candidate modulators can be evaluated for toxicological effects
using known methods (see, Lu, Basic Toxicology, Fundamentals, Target Organs, and
Risk Assessment, Hemisphere Publishing Corp., Washington (1985); U.S. Patent No.
5,196,313 to Culbreth (issued March 23, 1993); and U.S. Patent No. 5,567,952 to Benet
(issued October 22, 1996)). For example, toxicology of a candidate modulator can be
established by determining in vitro toxicity towards a cell line, such as a mammalian,
e.g., human, cell line. Candidate modulators can be treated with, for example, tissue
extracts, such as preparations of liver, such as microsomal preparations, to determine
increased or decreased toxicological properties of the chemical after being metabolized by
a whole organism. The results of these types of studies are often predictive of
toxicological properties of chemicals in animals, such as mammals, including humans.

Alternatively, or in addition to these in vitro studies, the toxicological properties
of a candidate modulator in an animal model, such as mice, rats, rabbits, or monkeys, can
be determined using established methods (see, Lu, supra (1985); and Creasey, Drug
Disposition in Humans, The Basis of Clinical Pharmacology, Oxford University Press,
Oxford (1979)). Depending on the toxicity, target organ, tissue, locus, and presumptive
mechanism of the candidate modulator, the skilled artisan would not be burdened to
determine appropriate doses, LD50 values, routes of administration, and regimes that
would be appropriate to determine the toxicological properties of the candidate
modulator. In addition to animal models, human clinical trials can be performed
following established procedures, such as those set forth by the United States Food and
Drug Administration (USFDA) or equivalents of other governments. These toxicity
studies provide the basis for determining the efficacy of a candidate modulator in vivo.



WO 98/13353 PCT/US97/17395

Efficacy of Candidate Modulators 

Efficacy of a candidate modulator can be established using several art recognized
methods, such as in vitro methods, animal models, or human clinical trials (see, Creasey,
supra (1979)). Recognized in vitro models exist for several diseases or conditions. For
example, the ability of a chemical to extend the life-span of HIV-infected cells in vitro is
recognized as an acceptable model to identify chemicals expected to be efficacious to
treat HIV infection or AIDS (see, Daluge et al., Antimicro. Agents Chemother.
41:1082-1093 (1995)). Furthermore, the ability of cyclosporin A (CsA) to prevent
proliferation of T-cells in vitro has been established as an acceptable model to identify
chemicals expected to be efficacious as immunosuppressants (see, Suthanthiran et al.,
supra, (1996)). For nearly every class of therapeutic, disease, or condition, an acceptable
in vitro or animal model is available. Such models exist, for example, for gastro-intestinal
disorders, cancers, cardiology, neurobiology, and immunology. In addition, these in vitro
methods can use tissue extracts, such as preparations of liver, such as microsomal
preparations, to provide a reliable indication of the effects of metabolism on the candidate
modulator. Similarly, acceptable animal models may be used to establish efficacy of
chemicals to treat various diseases or conditions. For example, the rabbit knee is an
accepted model for testing chemicals for efficacy in treating arthritis (see, Shaw and
Lacey, J. Bone Joint Surg. (Br) 55:197-205 (1973)). Hydrocortisone, which is approved
for use in humans to treat arthritis, is efficacious in this model, which confirms the
validity of this model (see, McDonough, Phys. Ther. 62:835-839 (1982)). When
choosing an appropriate model to determine efficacy of a candidate modulator, the skilled
artisan can be guided by the state of the art to choose an appropriate model, dose,
route of administration, regime, and endpoint, and as such would not be unduly burdened.

In addition to animal models, human clinical trials can be used to determine the
efficacy of a candidate modulator in humans. The USFDA, or equivalent governmental
agencies, have established procedures for such studies.

Selectivity of Candidate Modulators

The in vitro and in vivo methods described above also establish the selectivity of a
candidate modulator. It is recognized that chemicals can modulate a wide variety of
biological processes or be selective. Panels of cells based on the present invention can be
used to determine the specificity of the candidate modulator. Selectivity is evident, for
example, in the field of chemotherapy, where the selectivity of a chemical to be toxic
towards cancerous cells, but not towards non-cancerous cells, is obviously desirable.
Selective modulators are preferable because they have fewer side effects in the clinical
setting. The selectivity of a candidate modulator can be established in vitro by testing the
toxicity and effect of a candidate modulator on a plurality of cell lines that exhibit a
variety of cellular pathways and sensitivities. The data obtained from these in vitro
toxicity studies can be extended to animal model studies and human clinical trials to
determine toxicity, efficacy, and selectivity of the candidate modulator.

The selectivity, specificity and toxicology, as well as the general pharmacology,
of a test chemical can often be improved by generating additional test chemicals based on
the structure/property relationships of the test chemical originally identified as having
activity (a "Hit"). Test chemicals identified as having activity can be modified to
improve various properties, such as affinity, life-time in the blood, toxicology, specificity
and membrane permeability. Such refined test chemicals can be subjected to additional
assays as described herein for activity analysis. Methods for generating and analyzing
such chemicals are known in the art, such as U.S. patent 5,574,656 to Agrafiotis et al.

Compositions

The present invention also encompasses a modulator in a pharmaceutical
composition comprising a pharmaceutically acceptable carrier prepared for storage and
subsequent administration, which has a pharmaceutically effective amount of the
candidate modulator in a pharmaceutically acceptable carrier or diluent. Chemicals
identified by the methods described herein do not include chemicals publicly available as
of the filing date of the present application or in the prior art. Acceptable carriers or
diluents for therapeutic use are well known in the pharmaceutical art, and are described,
for example, in Remington's Pharmaceutical Sciences, Mack Publishing Co. (A.R.
Gennaro ed., 1985). Preservatives, stabilizers, dyes and even flavoring agents may be
provided in the pharmaceutical composition. For example, sodium benzoate, sorbic acid
and esters of p-hydroxybenzoic acid may be added as preservatives. In addition,
antioxidants and suspending agents may be used.

The compositions of the present invention may be formulated and used as tablets,
capsules or elixirs for oral administration; suppositories for rectal administration; sterile
solutions or suspensions for injectable administration; and the like. Injectables can be
prepared in conventional forms either as liquid solutions or suspensions, solid forms
suitable for solution or suspension in liquid prior to injection, or as emulsions. Suitable
excipients are, for example, water, saline, dextrose, mannitol, lactose, lecithin, albumin,
sodium glutamate, cysteine hydrochloride, and the like. In addition, if desired, the
injectable pharmaceutical compositions may contain minor amounts of nontoxic auxiliary
substances, such as wetting agents, pH buffering agents, and the like. If desired,
absorption enhancing preparations (e.g., liposomes) may be utilized.

The pharmaceutically effective amount of the candidate modulator required as a
dose will depend on the route of administration, the type of animal being treated, and the
physical characteristics of the specific animal under consideration. The dose can be
tailored to achieve a desired effect, but will depend on such factors as weight, diet,
concurrent medication and other factors which those skilled in the medical arts will
recognize. In practicing the methods of the invention, the pharmaceutical compositions
can be used alone or in combination with one another, or in combination with other
therapeutic or diagnostic agents. These products can be utilized in vivo, ordinarily in a
mammal, preferably in a human, or in vitro. In employing them in vivo, the
pharmaceutical composition can be administered to the mammal in a variety of ways,
including parenterally, intravenously, subcutaneously, intramuscularly, colonically,
rectally, nasally or intraperitoneally, employing a variety of dosage forms. Such methods
may also be applied to testing chemical activity in vivo.

As will be readily apparent to one skilled in the art, the useful in vivo dosage to be
administered and the particular mode of administration will vary depending upon the age,
weight and mammalian species treated, the particular pharmaceutical composition
employed, and the specific use for which the pharmaceutical composition is employed.
The determination of effective dosage levels, that is, the dosage levels necessary to
achieve the desired result, can be accomplished by one skilled in the art using routine
methods as discussed above. Typically, human clinical applications of products are
commenced at lower dosage levels, with the dosage level being increased until the desired
effect is achieved. Alternatively, acceptable in vitro studies can be used to establish
useful doses and routes of administration of the compositions identified by the present
methods using established pharmacological methods.

In non-human animal studies, applications of potential pharmaceutical
compositions are commenced at higher dosage levels, with the dosage being decreased
until the desired effect is no longer achieved or adverse side effects are reduced or
disappear. The dosage for the products of the present invention can range broadly
depending upon the desired effects and the therapeutic indication. Typically, dosages
may be between about 10 ng/kg and ^g/kg body weight, preferably between about 100
g and 10 mg/kg body weight. Administration is preferably oral on a daily basis.

The exact formulation, route of administration and dosage can be chosen by the
individual physician in view of the patient's condition (see, e.g., Fingl et al., in The
Pharmacological Basis of Therapeutics, 1975). It should be noted that the attending
physician would know how to and when to terminate, interrupt, or adjust administration
due to toxicity, organ dysfunction, or other adverse effects. Conversely, the attending
physician would also know to adjust treatment to higher levels if the clinical response
were not adequate (precluding toxicity). The magnitude of an administered dose in the
management of the disorder of interest will vary with the severity of the condition to be
treated and with the route of administration. The severity of the condition may, for
example, be evaluated, in part, by standard prognostic evaluation methods. Further, the
dose, and perhaps the dose frequency, will also vary according to the age, body weight, and
response of the individual patient. A program comparable to that discussed above may be
used in veterinary medicine.

Depending on the specific conditions being treated, such pharmaceutical
compositions may be formulated and administered systemically or locally. Techniques
for formulation and administration may be found in Remington's Pharmaceutical
Sciences, 18th Ed., Mack Publishing Co., Easton, PA (1990). Suitable routes may
include oral, rectal, transdermal, vaginal, transmucosal, or intestinal administration;
parenteral delivery, including intramuscular, subcutaneous, intramedullary injections, as
well as intrathecal, direct intraventricular, intravenous, intraperitoneal, intranasal, or
intraocular injections.

For injection, the pharmaceutical compositions of the invention may be
formulated in aqueous solutions, preferably in physiologically compatible buffers such as
Hanks' solution, Ringer's solution, or physiological saline buffer. For such transmucosal
administration, penetrants appropriate to the barrier to be permeated are used in the
formulation. Such penetrants are generally known in the art. Use of pharmaceutically
acceptable carriers to formulate the pharmaceutical compositions herein disclosed for the
practice of the invention into dosages suitable for systemic administration is within the
scope of the invention. With proper choice of carrier and suitable manufacturing practice,
the compositions of the present invention, in particular those formulated as solutions,
may be administered parenterally, such as by intravenous injection. The pharmaceutical
compositions can be formulated readily using pharmaceutically acceptable carriers well
known in the art into dosages suitable for oral administration. Such carriers enable the
chemicals of the invention to be formulated as tablets, pills, capsules, liquids, gels,
syrups, slurries, suspensions and the like, for oral ingestion by a patient to be treated.

Agents intended to be administered intracellularly may be administered using
techniques well known to those of ordinary skill in the art. For example, such agents may
be encapsulated into liposomes, then administered as described above. All molecules
present in an aqueous solution at the time of liposome formation are incorporated into the
aqueous interior. The liposomal contents are both protected from the external
microenvironment and, because liposomes fuse with cell membranes, are efficiently
delivered into the cell cytoplasm. Additionally, due to their hydrophobicity, small organic
molecules may be directly administered intracellularly.

Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an effective amount to
achieve their intended purpose. Determination of the effective amount of a pharmaceutical
composition is well within the capability of those skilled in the art, especially in light of
the detailed disclosure provided herein. In addition to the active ingredients, these
pharmaceutical compositions may contain suitable pharmaceutically acceptable carriers
comprising excipients and auxiliaries which facilitate processing of the active chemicals
into preparations which can be used pharmaceutically. The preparations formulated for
oral administration may be in the form of tablets, dragees, capsules, or solutions. The
pharmaceutical compositions of the present invention may be manufactured in a manner
that is itself known, e.g., by means of conventional mixing, dissolving, granulating,
dragee-making, levigating, emulsifying, encapsulating, entrapping, or lyophilizing
processes. Pharmaceutical formulations for parenteral administration include aqueous
solutions of the active chemicals in water-soluble form. Additionally, suspensions of the
active chemicals may be prepared as appropriate oily injection suspensions. Suitable
lipophilic solvents or vehicles include fatty oils such as sesame oil, or synthetic fatty acid
esters, such as ethyl oleate or triglycerides, or liposomes. Aqueous injection suspensions
may contain substances which increase the viscosity of the suspension, such as sodium
carboxymethyl cellulose, sorbitol, or dextran. Optionally, the suspension may also
contain suitable stabilizers or agents that increase the solubility of the chemicals to allow
for the preparation of highly concentrated solutions.

Pharmaceutical compositions for oral use can be obtained by combining the active
chemicals with solid excipient, optionally grinding a resulting mixture, and processing
the mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or
dragee cores. Suitable excipients are, in particular, fillers such as sugars, including
lactose, sucrose, mannitol, or sorbitol; cellulose preparations such as, for example, maize
starch, wheat starch, rice starch, potato starch, gelatin, gum tragacanth, methyl cellulose,
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hydroxypropylmethyl-cellulose, sodium carboxymethylcellulose, and/or 
polyvinylpyrrolidone (PVP). If desired, disintegrating agents may be added, such as the 
cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt thereof such as sodium 
alginate. Dragee cores are provided with suitable coatings. For this purpose, 
concentrated sugar solutions may be used, which may optionally contain gum arabic, talc,
polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer 
solutions, and suitable organic solvents or solvent mixtures. Dye-stuffs or pigments may 
be added to the tablets or dragee coatings for identification or to characterize different 
combinations of active chemical doses. 
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EXAMPLES 

Example 1 Beta-lactamase Expression Constructs

To investigate various beta-lactamase expression constructs (BLECs),
multiple BLECs were constructed and transfected into mammalian cells.
The first of these, BLEC-1, was constructed by cloning the cytoplasmic
form of beta-lactamase, SEQ. ID NO. 4 (see Table 1), such that it is functionally
linked to the En-2 splice acceptor sequence, as shown in FIG. 3. This vector, when
inserted into a genomic intron, will result in the generation of a fusion RNA
between an endogenous target gene and beta-lactamase ("BL"). BLEC-1 also
contains a bovine growth hormone poly-adenylation sequence (BGH-polyA)
downstream of the cytoplasmic beta-lactamase.

BLEC-2 was constructed identically to BLEC-1, except that a poliovirus
internal ribosomal entry site (IRES) sequence was inserted between the En-2
splice acceptor and beta-lactamase ("BL"). This eliminates reading frame restrictions and
possible inactivation of beta-lactamase by fusion to an endogenous protein. To
allow for selection of stable transfectants for BLEC-1 and BLEC-2, a neomycin
or G418 resistance cassette was cloned downstream of the BGH poly-adenylation
sequence. This cassette consists of a promoter, a neomycin resistance gene and an
SV40 poly-adenylation sequence, as shown in FIG. 3.

Two alternative constructs, BLEC-3 and BLEC-4, were constructed similarly
to BLEC-1 and BLEC-2, respectively, except that the SV40 poly A was replaced
with a splice donor sequence. This should enrich for insertion into transcribed
regions, as it requires the presence of an endogenous splice acceptor and
polyadenylation sequence downstream of the vector insertion site to generate
G418 resistant clones. BLEC-3 and BLEC-4 also use the PGK promoter to drive
the neomycin resistance gene instead of the human beta-actin promoter.
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The structure of CCF2-AM (BL substrate) used in the experiments below 

is: [chemical structure drawing not reproduced]


Table 2 


SEQ.      Parent BL gene          Modification                      Mammalian            Location of
ID NO.    and reference                                             expression vector    expression

1         Escherichia coli        Signal sequence replaced by:      pMAM-neo,            cytoplasmic
          RTEM,                   ATG ACT                           glucocorticoid-
          Kadonaga et al.                                           inducible

2         Escherichia coli        Wild type secreted enzyme;        pMAM-neo,            secreted
          RTEM,                   2 changes in pre-sequence:        glucocorticoid-      extracellularly
          Kadonaga et al.         ser 2 arg, ala 23 gly             inducible

3         Escherichia coli        beta-globin upstream leader:      pCDNA 3,             cytoplasmic
          RTEM                    AAGC[illegible]GCAGAAGCTCA        CMV promoter,
                                  GAATAAACGCAACTTTCCG               and pZEO,
                                  Kozak sequence:                   SV40 promoter
                                  GGTACCACCATGG
                                  Signal sequence replaced by:
                                  ATG GGG

4         Escherichia coli        Kozak sequence:                   pCDNA 3,             cytoplasmic
          RTEM                    GGTACCACCATGG                     CMV promoter,
                                  Signal sequence replaced by:      and BLECs
                                  ATG GAC
                                  (GAC replaces CAT)

5         Bacillus                Signal sequence removed,          pCDNA 3,             cytoplasmic
          licheniformis 749/C,    new N-terminal ATG                CMV promoter
          Neugebauer et al.
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Table 3 
Functional Elements

Vector   Splice acceptor       Adapter          Reporter gene   Reporter gene   Selection             Resistance marker,
                                                                poly A          promoter              poly A

BLEC-1   En2 splice acceptor   protein fusion   SEQ. ID NO. 4   BGH polyA       beta-actin promoter   Neo, polyA
BLEC-2   En2 splice acceptor   IRES             SEQ. ID NO. 4   BGH polyA       beta-actin promoter   Neo, polyA
BLEC-3   En2 splice acceptor   protein fusion   SEQ. ID NO. 4   BGH polyA       PGK promoter          Neo, splice donor
BLEC-4   En2 splice acceptor   IRES             SEQ. ID NO. 4   BGH polyA       PGK promoter          Neo, splice donor

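
The functional elements that distinguish the four BLEC vectors can be summarized in a small data model. The following Python sketch is illustrative only: the class name, field names and helper methods are hypothetical and are not part of the disclosure. It simply encodes the adapter, selection promoter and marker 3' element given for each vector, and derives the two properties discussed in Example 1 (whether an in-frame protein fusion is needed, and whether a downstream endogenous polyadenylation sequence is needed for G418 resistance).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlecVector:
    """Hypothetical record of the elements that differ between BLEC vectors."""
    name: str
    adapter: str             # "protein fusion" or "IRES"
    selection_promoter: str  # promoter driving the neomycin resistance gene
    marker_3prime: str       # "polyA" or "splice donor"

    def needs_in_frame_fusion(self) -> bool:
        # Without an IRES, beta-lactamase is made as a fusion protein,
        # so the insertion must preserve the reading frame.
        return self.adapter == "protein fusion"

    def requires_downstream_polya(self) -> bool:
        # A splice donor after Neo relies on an endogenous splice acceptor
        # and polyadenylation sequence downstream of the insertion site.
        return self.marker_3prime == "splice donor"


VECTORS = [
    BlecVector("BLEC-1", "protein fusion", "beta-actin", "polyA"),
    BlecVector("BLEC-2", "IRES", "beta-actin", "polyA"),
    BlecVector("BLEC-3", "protein fusion", "PGK", "splice donor"),
    BlecVector("BLEC-4", "IRES", "PGK", "splice donor"),
]

if __name__ == "__main__":
    for v in VECTORS:
        print(v.name, v.needs_in_frame_fusion(), v.requires_downstream_polya())
```

Such a representation may be convenient when a result from a library has to be traced back to the properties of the vector used to build it.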


Example 2 Libraries of BLEC Clones

To investigate the function of each of the BLEC vectors, they were
transfected by electroporation into RBL-1 cells, and stable clones were selected for
each of the four BLEC plasmids (see Table 2). Selective media contained
DMEM, 10% fetal bovine serum (FBS) and 400 µg/ml Geneticin (G418). G418
resistant cell clones were pooled from multiple transfections to generate a library
of BLEC stable integrated clones.

This library of BLEC-1 integrated clones was loaded with the fluorescent
substrate of BL (CCF-2-AM) by adding 10 µM CCF-2-AM in HBSS containing
10 µM HEPES, pH 7.1, and 1% glucose. After a 1 hour incubation at 22°C, cells were
washed with HBSS and viewed upon excitation with 400 nm light using a 435 nm
long pass emission filter. Under these assay conditions, 10% of the cells were blue
fluorescent, indicating they were expressing beta-lactamase. This result suggests that
the BLEC-1 construct is functioning as a gene integration vector.

Stable cell lines were also generated by transfecting BLEC-1 into CHO-K1
and Jurkat cells. Populations of BLEC-1 integrated clones from CHO and
Jurkat cells showed similar results to those obtained with RBL-1 clones, with 10-15%
of BLEC integrated cell clones expressing BL as determined by their
blue/green ratio after loading with CCF-2-AM. This result shows that the BLECs
function in a variety of cell types, including human T-cells (Jurkat), rat basophilic
leukocytes (RBL), and Chinese hamster ovary cells (CHO).

Example 3 Isolating BLEC Clones Expressing Beta-lactamase

Fluorescence activated cell sorting (FACS) of multi-clonal populations of RBL-1
gene integrated clones was used to identify clones with regulated BL gene
expression. A BL non-expressing population of cells was isolated by sorting a
library of BLEC-1 integrated clones generated by transfection of RBL-1 cells as
described in Example 2. 180,000 clones expressing little or no BL were isolated
by sorting for clones with a low blue/green ratio (R1 population), as shown in
FIG. 4A. This population of clones was grown for seven days and resorted by
FACS to test the population's fluorescent properties. FACS analysis of the cell
clones sorted from R1 shows that most of the cells with a high blue/green ratio
(~0.1%) have been removed by one round of sorting for green cells, as shown in
FIG. 4B. It is also clear that the total population has shifted towards more green
cells compared to the parent population, as shown in FIG. 4A. There are,
however, cells with a high blue/green ratio showing up in the green sorted
population. These may represent clones in which the BLEC has integrated into a
differentially regulated gene, such as a gene whose expression changes throughout
the cell cycle.

The population of RBL-1 clones shown in FIG. 4B was stimulated by
addition of 1 µM ionomycin for 6 hours and resorted to identify clones which had
the BLEC integrated into a gene which is inducible by increasing intracellular
calcium. Table 4 below summarizes the results from this experiment. A greater
percentage of blue clones was present in all three of the blue sub-populations (R4,
R2, R5) in the ionomycin stimulated population when compared to the unstimulated
population. This sorted population represents the following classes of blue cells:
R4 (highest blue/green ratio (bright blues)), R2 (multicolor blues), and R5 (lower
blue/green ratio (least blue)). Additionally, in the ionomycin stimulated
population there is a decrease in the percent green cells (R6) from the unstimulated
population. This increase in blue clones in the ionomycin stimulated
population indicates that a sub-population of blue clones has the BLEC inserted
into a gene which is induced by ionomycin. Individual blue clones were sorted
from the ionomycin stimulated population and are analyzed for their expression
profile.

Table 4
Sort Window (See FIG. 4)

                               R4       R2       R5       R6
                               (blue)   (blue)   (blue)   (green)

Unstimulated %                 0.11     2.39     1.53     66.23
1 µM Ionomycin Stimulated %    0.24     3.5      2.5      61.64
Ratio +Ion/-Ion                2.2      1.5      1.6      0.9
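
The last row of Table 4 is the percentage of cells in each sort window after ionomycin stimulation divided by the percentage in the same window without stimulation. The following Python sketch reproduces that row from the two rows above it; the function and variable names are illustrative only and do not appear in the original disclosure.

```python
# Percent of cells falling in each sort window, taken from Table 4.
UNSTIMULATED = {"R4": 0.11, "R2": 2.39, "R5": 1.53, "R6": 66.23}
IONOMYCIN = {"R4": 0.24, "R2": 3.5, "R5": 2.5, "R6": 61.64}


def stimulation_ratios(stimulated, unstimulated):
    """Return the +Ion/-Ion ratio for every sort window, rounded to one decimal."""
    return {
        window: round(stimulated[window] / unstimulated[window], 1)
        for window in unstimulated
    }


RATIOS = stimulation_ratios(IONOMYCIN, UNSTIMULATED)

if __name__ == "__main__":
    for window, ratio in RATIOS.items():
        print(window, ratio)
```

A ratio above 1 in the blue windows together with a ratio below 1 in the green window is the pattern expected when a fraction of the library carries the BLEC in an ionomycin-inducible gene.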

In addition to allowing the isolation of cell clones with inducible BL
expression from large populations of cells, clones can be isolated based on their level
of BL expression. To isolate cells with different levels of BL expression, blue
clones can be sorted after different exposure times to substrate or by their
blue/green ratio. Cells with a lower blue/green ratio or those requiring longer
incubation times will represent clones expressing lower levels of BL. This is
demonstrated by the FACS scan above: clones sorted from the R4 window have
a higher blue/green ratio, indicating they are expressing higher levels of BL, while cells
sorted from the R5 window have a lower blue/green ratio (visually turquoise), indicating
lower BL expression. Cells sorted from the R3 window, which contains all the blue
cells, show variation in blue color from bright blue (high blue/green ratio) to
turquoise blue (low blue/green ratio).

To demonstrate that the expression constructs are relatively stable for
sorted clones, cells were sorted from R3 (blue population) as shown in FIG. 4A
and cultured in the absence of selective pressure for several weeks. There was
little change in the percent of blue cells in the cultured population, with the percent
blue being maintained at ~90%. This result represents a 10-fold enrichment for
clones constitutively expressing BL by one round of FACS selection.

Cells in the R6 window have the lowest blue/green ratio and appear green
visually. R6 cells are therefore not expressing BL or are expressing BL below the
detection limit of our assay.

Example 4 Stability of BLEC Clones

To further investigate the stability of reporter gene integrations into
constitutively active genes, single blue clones were sorted from cell clone
populations generated by transfecting RBL-1 and CHO-K1 with BLEC-1. After
addition of CCF-2 to the multi-clonal cell population, single blue clones were
sorted into 96 well microtiter plates. These clones were expanded to 24 well
dishes, which took 7-10 days. The cell viability varied between the two cell types,
with 80% of the sorted clones forming colonies for the CHO and 36% for the
RBL-1 cells. After expansion into 24 well dishes, 20 CHO BLEC-1 stable
clones were tested for BL expression by addition of CCF-2-AM. 20/20 of these
clones expressed BL, with the percent blue cells within a clone ranging from 70%
to 99%. This result is consistent with the earlier data presented for RBL-1, in which
the blue sorted population was tested for BL expression after several weeks of
non-selective culturing. There were, however, significant differences between
clones in their blue/green ratio and hence their level of BL expression. This
suggested that genes with different levels of constitutive expression had been
tagged with the BLEC. Although there were significant differences in blue color
between separate clones, the blue fluorescence within a clone was consistently
similar, as would be expected in a clonal population. There were, however, green
cells within the blue sorted clones, which may indicate that there is some loss of
the BLEC-1 plasmid integration site when clones are grown up from a single cell.
Single clones were expanded and used to make RNA for RACE to identify
the target gene and DNA for Southern analysis.

Example 5 Isolation of Jurkat BLEC integrated clones that constitutively express
beta-lactamase

Jurkat cells are a T-cell line derived from a human T-cell leukemia. This cell line
maintains many of the signaling capabilities of primary T-cells and can be activated using
anti-CD3 antibodies or mitogenic lectins such as phytohemagglutinin (PHA). Wild type
Jurkat cells were transfected by electroporation with a beta-lactamase trapping construct
(BLEC-1, BLEC-1A, or BLEC-1B, see FIG. 3) ("BLEC constructs") that contains a
beta-lactamase gene that is not under control of a promoter recognized
by the Jurkat cells and a neomycin resistance gene that can be expressed in Jurkat cells.
BLEC-1 is set forth in FIG. 3. BLEC-1A has a NotI site after the SV40 poly A site.
This allows the cutting of the insert away from the plasmid backbone. BLEC-1B is the
same as BLEC-1A except that the ATG at the beta-lactamase translation start has been
changed to ATC. This eliminates the translation start site and requires the addition of an
upstream ATG to produce beta-lactamase. Stable transformants were selected for their
resistance to 800 µg/ml G418. After 400 separate experiments, a pool of greater than one
million clones with BLEC insertions was produced. This population of cells is a library
of cell clones in which the BLEC construct is inserted throughout the genome ("Jurkat
BLEC library"). Approximately ten percent of the cells in this library express
beta-lactamase in the absence of added stimuli. Beta-lactamase activity in the cells was
determined by contacting the cells with CCF2-AM and loading in the presence of
Pluronic 128 (from Sigma) at about 100 µg/ml. Individual clones or populations of
cells that express beta-lactamase can be obtained by FACS sorting.

Genomic Southern analysis of these clones using a DNA probe encoding
beta-lactamase showed the vector inserted into the host genome between one and three times
per cell, with most clones having one or two vector insertion sites (for Genomic Southern
analyses, see Sambrook, Molecular Cloning, A Laboratory Manual, Cold Spring Harbor
Laboratory Press (1989)). Northern analysis of these clones using a DNA probe that
encodes beta-lactamase showed that the level of expression and message size varied from
clone to clone (for Northern analysis, see Sambrook, supra, (1989)). This indicated that
fusion transcripts were being made with different genes functionally tagged with
beta-lactamase, which allows for the reporter gene to be expressed under the same conditions
as the endogenous gene. Using appropriate primers, RACE (Gibco BRL) was used to
isolate the genes linked to the expressed beta-lactamase gene in a subset of these
constitutively expressing clones. These genes were cloned and sequenced using known
methods (see Sambrook, supra, (1989)). These sequences were compared with known
sequences using established BLAST search techniques. Known sequences that were
identified included: beta-catenin, moesin, and beta-adaptin. Additionally, several novel
sequences were identified which represent putative genes.

Example 6 Isolation of Jurkat BLEC integrated clones that show induced
expression of beta-lactamase upon activation

Jurkat BLEC integrated clones that exhibit beta-lactamase expression upon
activation of the Jurkat cells by PHA (PHA induced clones) were isolated by FACS
sorting a Jurkat BLEC library. These clones represent cells in which the trapping
construct had integrated into a gene up regulated by PHA (T-cell) activation. Thus, these
cells report the transcriptional activation of a gene upon cellular activation. Individual
clones were identified and isolated by FACS using CCF2-AM to detect beta-lactamase
activity. This clone isolation method, the induced sorting paradigm, used three sequential
and independent stimulation and sorting protocols. A FACS read out for Jurkat cells that
do not contain a BLEC construct, contacted with CCF2-AM, was used as a control. These
control cells were all green.

The first sorting procedure isolated a pool of blue (beta-lactamase expressing, as
indicated by contacting the cells with CCF2-AM) clones which had been pre-stimulated
for 18 hours with 10 µg/ml PHA from an unsorted Jurkat BLEC library. This pool
represented 2.83% of the original unsorted cell population. This selected pool contained
clones that constitutively express beta-lactamase and clones in which the beta-lactamase
expression was induced by PHA stimulation ("stimulatable clones"). After sorting, this
pool of clones was cultured in the absence of PHA to allow the cells, in the case of
stimulatable clones, to expand and return to a resting state (i.e. lacking PHA induced gene
expression).

The second sorting procedure isolated a pool of green (non-beta-lactamase
expressing, as indicated by contacting the cells with CCF2-AM) cell clones from the first
sorted pool that had been grown, post-sorting, without PHA stimulation for 7 days. The
second sorting procedure separates clones that constitutively express beta-lactamase from
cells that express beta-lactamase upon stimulation. This second pool represented 11.59%
of the population of cells prior to the second sort. This pool of cells was cultured in the
absence of PHA to amplify the cell number prior to a third sort.

The third sorting procedure used the same procedure as the first sorting procedure
and was used to isolate individual cells that express beta-lactamase in response to being
contacted with 10 µg/ml PHA for 18 hours. Single blue clones were sorted individually
into single wells of 96 well microtiter plates. This three round FACS sorting procedure
enriched PHA inducible clones about 10,030 fold.
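
The logic of the induced sorting paradigm can be sketched as three successive filters. The Python sketch below is an idealized illustration only: it assumes each clone is either blue or green in each state with no cell-to-cell variation, and the `Clone` class, its fields and the four example clones are hypothetical rather than data from this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clone:
    """Idealized clone: color is fully determined by the stimulation state."""
    name: str
    blue_resting: bool     # expresses beta-lactamase without stimulus
    blue_stimulated: bool  # expresses beta-lactamase after stimulus


def induced_sort(library):
    """Apply the three rounds of the induced sorting paradigm."""
    # Round 1: stimulate, keep blue cells (constitutive + inducible clones).
    pool = [c for c in library if c.blue_stimulated]
    # Round 2: rest without stimulus, keep green cells (drops constitutive clones).
    pool = [c for c in pool if not c.blue_resting]
    # Round 3: stimulate again, keep blue cells (confirms inducibility).
    return [c for c in pool if c.blue_stimulated]


# Hypothetical library containing one clone of each possible behavior.
LIBRARY = [
    Clone("silent", blue_resting=False, blue_stimulated=False),
    Clone("constitutive", blue_resting=True, blue_stimulated=True),
    Clone("induced", blue_resting=False, blue_stimulated=True),
    Clone("repressed", blue_resting=True, blue_stimulated=False),
]

if __name__ == "__main__":
    print([c.name for c in induced_sort(LIBRARY)])
```

In real populations each round is imperfect, which is why the third round repeats the first rather than being redundant.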

These isolated clones were expanded and tested for PHA inducibility by
microscopic inspection with and without PHA stimulation in the presence of CCF2-AM.
A total of fifty-five PHA inducible clones were identified using this procedure. The PHA
inducibility for these clones ranged from a 1.5 to 40 fold change in the 460/530 ratio as
compared to unstimulated control cells. Genomic Southern analysis using a DNA probe
encoding beta-lactamase established that these clones represented 34 independent stable
vector integration events. A list of clones obtained by the methods of the present
invention and their characteristics is provided below in Table 6 and Table 7.
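
The fold change in the 460/530 ratio referred to above can be computed as the ratio of the stimulated emission ratio to the unstimulated emission ratio. The Python sketch below is illustrative only. The function names, the example intensities and the use of 1.5-fold as a cutoff are assumptions made here for illustration; the text reports 1.5-fold only as the lower end of the observed range, not as a selection threshold.

```python
def emission_ratio(blue_460: float, green_530: float) -> float:
    """Blue (460 nm) over green (530 nm) emission intensity."""
    if green_530 <= 0:
        raise ValueError("green emission must be positive")
    return blue_460 / green_530


def fold_change(stimulated, unstimulated) -> float:
    """Fold change in the 460/530 ratio; each argument is a (blue, green) pair."""
    return emission_ratio(*stimulated) / emission_ratio(*unstimulated)


def is_inducible(fold: float, threshold: float = 1.5) -> bool:
    # The 1.5-fold cutoff is an assumed value chosen for this sketch.
    return fold >= threshold


# Hypothetical background-corrected intensities (arbitrary units).
RESTING = (250.0, 1000.0)     # 460/530 ratio of 0.25
STIMULATED = (2000.0, 400.0)  # 460/530 ratio of 5.0
FOLD = fold_change(STIMULATED, RESTING)

if __name__ == "__main__":
    print(FOLD, is_inducible(FOLD))
```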

In addition to PHA inducible clones, Phorbol 12-myristate 13-acetate (PMA)
(Calbiochem), Thapsigargin (Thaps) (Calbiochem), and PMA + Thaps inducible clones
were isolated using the general procedure set forth above using the indicated inducer
rather than PHA. PMA is a specific activator of PKC (protein kinase C) and Thaps is a
specific activator of intracellular calcium ion release. These clones were isolated
using three rounds of FACS using the general procedures described for the PHA
inducible clones in Example 5. In such instances, other stimulants were substituted for
PHA. PMA was provided at 8 nM and Thaps was provided at 1 µM. When these two
stimulants were combined, their concentration was not changed. As shown in Table 5,
clones were selected based on their activation by PMA, Thaps, or PMA with Thaps after
three or eighteen hours of stimulation ("stimulation time"). These results demonstrate
that the FACS sorting criteria can be varied depending upon the type of modulated clones
desired. By using varied selection conditions, it is possible to isolate functionally distinct
clones downstream of the desired signaling target.

Example 7 Isolation of Jurkat BLEC integrated clones that show repressed
expression of beta-lactamase upon activation

Jurkat BLEC clones that exhibit decreased beta-lactamase expression upon
activation of the Jurkat cells by PHA were isolated by FACS sorting. These clones
represent cells in which the BLEC trapping construct had integrated into a gene down
regulated by PHA (T-cell) activation. Thus, these cells report the transcriptional
repression of a gene upon cellular activation. Individual clones were identified and
isolated by FACS using CCF2-AM to detect beta-lactamase activity using the following
repressed sorting paradigm.

A first sort was used to isolate a population of cells that constitutively express
beta-lactamase by identifying and isolating a population of blue cells from an
unstimulated population of BLEC transfected Jurkat cells contacted with CCF2-AM.
The sorted population of cells represented 2.89% of the unsorted population. These cells
were cultured, divided into two pools, and stimulated with one of two different stimuli,
either 10 µg/ml PHA for 18 hours, or 8 nM PMA and 1 µM Thapsigargin for 18 hours.
These stimulated cells were contacted with CCF2 (loading in the presence of PEG 400
(4% weight/volume) and Pluronic 128 (100 µg/ml)) and the green cells in the population
were sorted using FACS. The sorted population represented 8.41% of the cell
population prior to the second sort. The third round of FACS was for single blue
unstimulated cells. The population of cells obtained represented 18.2% of the cell
population prior to the third sort.

This sorting procedure represents a 2,260-fold enrichment for PHA repressible
clones. These clones have the beta-lactamase gene integrated into a gene that is down
regulated by PHA stimulation of the cells. Six of 80 individual clones tested were
repressed by PHA or PMA + Thapsigargin. All of these clones were confirmed to be
independent integration events by genomic Southern analysis using a DNA probe
encoding beta-lactamase. The results of these studies are presented in Table 5.
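
The stated enrichment is reproduced if one assumes that each round of sorting enriches by the reciprocal of the fraction of cells retained, and that the rounds multiply. The Python sketch below applies that assumption to the three percentages reported in this example (2.89%, 8.41% and 18.2%); the function name is illustrative only, and the calculation is offered as a plausible reading of the figure rather than as the method actually used.

```python
from math import prod


def fold_enrichment(fractions_kept):
    """Product of 1/fraction over all sorting rounds (assumed model)."""
    if any(not 0 < f <= 1 for f in fractions_kept):
        raise ValueError("each fraction must be in (0, 1]")
    return prod(1.0 / f for f in fractions_kept)


# Fraction of the population retained in each round of the repressed sort.
REPRESSED_SORT_FRACTIONS = [0.0289, 0.0841, 0.182]
ENRICHMENT = fold_enrichment(REPRESSED_SORT_FRACTIONS)

if __name__ == "__main__":
    print(f"{ENRICHMENT:.0f}-fold")
```

Under this model the three rounds give a value within about one unit of the 2,260-fold figure quoted above.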

TABLE 5

Identification of trapping cell lines with reporter gene
expression which is regulated by T-cell activation

Stimuli (Dose)                First Sort Activation   Stimulation   Sorting     Clones        Clones with One or Two
                              Chemical and Time       Time          Paradigm    Isolated      Vector Insertion(s)
                              of Exposure                                                     1          2

PHA (10 µg/ml)                PHA, 18 hours           18 hours      Induced     [illegible]   24         10
PMA (8 nM) + Thaps (1 µM)     PMA + Thaps, 3 hours    3 hours       Induced     [illegible]   1          0
PMA (8 nM)                    PMA, 3 hours            3 hours       Induced     3             2          1
Thaps (1 µM)                  Thaps, 3 hours          3 hours       Induced     2             2          0
PHA (10 µg/ml) or             No Stimulation          18 hours      Repressed   6             5          1
PMA (8 nM) + Thaps (1 µM)


Example 8 Specificity of T-cell modulated clones

Isolated clones from PHA-induced (Example 6) and PHA-repressed (Example 7)
procedures described above were characterized to determine the specificity of their
modulation and time required for induction or repression. Clones were stimulated with
multiple activators or inhibitors over a one to twenty-four hour time interval. As shown
in Table 6, five clones produced by the induced and repressed sorting paradigms using a
plurality of activators were tested for their responsiveness to a variety of T-cell activators,
suppressors, and combinations thereof.


TABLE 6

Sorting protocols and specificity of activated BLEC Jurkat clones

Sorting procedures: stimulus and (cell color sorted for)

Clone           Paradigm    First Sort                     Second Sort            Third Sort

[illegible]     Induced     PHA (a) (blue)                 N/S (green)            PHA (blue)
J32-6D4         Induced     PHA (blue)                     N/S (green)            PHA (blue)
C2              N/S         N/S                            N/S                    N/S
J389-PT14       Induced     PMA (b) + Thaps (c) (blue)     N/S (green)            PMA + Thaps (blue)
J83 97-PPTR2    Repressed   N/S (blue)                     PMA + Thaps (green)    N/S (blue)
J83-PT18        Induced     PHA (blue)                     N/S (green)            PHA (blue)

Relative Beta-Lactamase Activity of the Clone by the Indicated Stimulus
After 24 hours (% of maximum activated stimulus)

Clone           None   PMA      Thaps    PMA (8 nM)   PMA (8 nM)     PHA          PHA (10 µg/ml)
                       (8 nM)   (1 µM)   + Thaps      + Thaps        (10 µg/ml)   + CsA
                                         (1 µM)       (1 µM) + CsA                (100 nM)
                                                      (100 nM)

[illegible]     0      <1       100      50           <5             60           <5
J32-6D4         0      60       1-2      100          70             80           [illegible]
C2              0      <1       0        100          <1             30           1
J389-PT14       0      90       5        85           100            85           90
J83 97-PPTR2    0      100      85       -50          85             67           75
J83-PT18        0      80       100      [illegible]  70             60           60

"N/S" means "no stimulation"
(a) concentration of PHA used was 1 µg/ml.
(b) concentration of PMA used was 8 nM.
(c) concentration of Thaps used was 1 µM.

In this study, PMA, which is a PKC activator; Thapsigargin, which increases
intracellular calcium; PHA, which activates the T-cell receptor pathway; and cyclosporin
A, which is a clinically approved immunosuppressant that inhibits the Ca2+-dependent
phosphatase calcineurin, were investigated for their ability to modulate beta-lactamase
expression in PHA induced and repressed BLEC clones.

The selected clones show varied dependence for their activation and inhibition by
these activators and inhibitors, which gives an indication of the signaling events required
for their transcriptional activation. Five of the listed clones were generated using the
approaches described above in Example 6. The clone C2 was generated using a more
classical approach. This clone was generated by transfecting a plasmid construct in which
a 3X NFAT response element has been operably linked to beta-lactamase expression.
This 3X NFAT element represents a DNA sequence that is present in the promoter region
of IL-2 and other T-cell activated genes. In addition, the C2 cell line has been stably
transfected with the M1 muscarinic receptor. This allows the activation of beta-lactamase
expression in this clone using an M1-muscarinic agonist such as carbachol. This cell line
therefore represents a good control for the cellular activators and inhibitors tested, as the
signaling events required for its activation are established.

The results of these studies indicate that the cell lines generated vary in their
specificity towards activation or repression by activators. Thus, depending on the type of
system that these cells are to be used to investigate, a panel of clones with varying
specificity towards a specific pathway is made available by the present methods.
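
One simple way to compare clones for dependence on a given pathway is to express the effect of an inhibitor as a percent inhibition of the stimulated response. The Python sketch below applies this to the PMA + Thaps columns of Table 6, with and without cyclosporin A, for two clones. The metric and the function name are assumptions introduced here for illustration and are not part of the original analysis; the clone label J389-PT14 follows the reading of Table 6 used in this text.

```python
def percent_inhibition(activity_without: float, activity_with: float) -> float:
    """Percent reduction in activity when the inhibitor is added.

    Negative values mean activity was higher with the inhibitor present.
    """
    if activity_without <= 0:
        raise ValueError("uninhibited activity must be positive")
    return 100.0 * (1.0 - activity_with / activity_without)


# Relative activity (% of maximum) from Table 6:
# (PMA + Thaps, PMA + Thaps + CsA)
TABLE6_PMA_THAPS = {
    "J32-6D4": (100.0, 70.0),
    "J389-PT14": (85.0, 100.0),
}

INHIBITION = {
    clone: percent_inhibition(without, with_csa)
    for clone, (without, with_csa) in TABLE6_PMA_THAPS.items()
}

if __name__ == "__main__":
    for clone, value in INHIBITION.items():
        print(f"{clone}: {value:.1f}% inhibition by CsA")
```

On these two rows the calculation suggests a partial cyclosporin A effect for J32-6D4 and none for J389-PT14, consistent with the statement that the clones differ in pathway dependence.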

Table 7 and Table 8 provide data similar to that provided in Table 5 for all of the
clones obtained by the methods of Examples 5 to 7.


TABLE 7

Characterization of induced BLEC Jurkat clones

Change in 460/530 ratio in the indicated clone by the following activator

Clone       Time (hours)      PHA          Thaps    PMA      PMA (8 nM)   Anti-CD3
number      for first         (10 µg/ml)            (8 nM)   + Thaps      (Pharmingen)
            detectable
            change in color

J325B5      6                 7            Nt       2-3      Nt           4-5
J325Bn      6                 9            1-2      2-3      Nt           5-6
J325E3      6                 7            Nt       2-3      Nt           4-5
J325G4      6                 3-4          Nt       3-4      Nt           4-5
J325E6      6                 11           Nt       3-4      Nt           6
J326C9      6                 4-5          1-2      2-3      Nt           3-4
J325E1      <2                8            Nt       8        Nt           5-6
J326D4      <2                10           0        10       Nt           5-6
J326D7      <2                10           Nt       10       Nt           5-6
J326F7      <2                10           Nt       10       Nt           5-6
J326H4      <2                10           Nt       10       Nt           5-6
J83Pil      Nt                3-4          3-4      3-4      4-5          2-3
J83P12      5-6               8            1-2      7-8      7-8          3-4
J83PI8      5-6               4-5          1-2      4-5      4-5          2-3
J83P13      5-6               5-6          6-7      3-4      5-6          2-3
J83PI4      4-6               3-4          3-4      0        2-3          2
J83PI6      6-18              6-7          7-8      0        4-5          4
J83PI9      6                 6            5-6      0        4-5          3-4
J83PI5      Nt                Nt           Nt       Nt       Nt           Nt
J83PI7      6-18              2            2        2        2            1.5-2
J83Pn5      Nt                3-4          2        3-4      3-4          3-4
J83Pn6      Nt                3-4          1-2      3-4      3-4          2-3
J83Pn8      Nt                5-6          7-8      5        Nt           Nt
J83PI12     Nt                Nt           Nt       Nt       Nt           Nt
J83Pn4      Nt                2            2        2        Nt           Nt
J83Pn7      Nt                Nt           Nt       Nt       Nt           Nt
J83PII9     Nt                5-6          1-2      3        1-2          1-2
J83P11 1    Nt                Nt           Nt       Nt       Nt           Nt
J83Pn3      Nt                2-3          2-3      0        Nt           Nt
J97P)1      Nt                3-4          3-4      3-4      3-4          3-4
J97PI2      Nt                2-3          Nt       Nt       2-3          Nt
J97PI3      Nt                1-2          1-2      1-2      1-2          Nt
J97P14      Nt                1-2          1-2      1-2      1-2          Nt
J97PI5      Nt                1.5          1.9      1.5      2-3          Nt
J97P36      Nt                3-4          4-6      1-2      4-6          Nt
J97PI13     Nt                2-3          5-6      1-2      4.5          Nt
J97PI18     Nt                1-2          3-4      1-2      4-5          Nt
J97PI7      Nt                3-4          4-5      1-2      5-6          Nt
J97Pn7      Nt                4-5          7-8      1-2      8-10         Nt
J97PI8      Nt                2.5-3        3-4      1-2      3-4          Nt
J97PI9      Nt                2-3          4-5      1-2      5-6          Nt
J97PI10     Nt                3-4          3-4      1-2      4-5          Nt
J97Pr23     Nt                4-5          4-5      1-2      4-5          1-2
J97PIII     Nt                3-4          5-6      2        4-5          Nt
J97Pn5      Nt                1-2          3-4      1-2      3-4          Nt
391PM2      Nt                3-4          5-6      2-3      5-6          Nt
J97Pi22     Nt                5-6          5-7      2-3      3-4          3-4
J97Pn4      Nt                4-5          3-4      2        4-5          Nt
J97PI116    Nt                2-3          3-4      2-3      4            Nt
J97P1I9     Nt                2-3          2-3      1-2      2-4          Nt
J97PI20     Nt                1-2          2-3      1-2      1-2          Nt
i97P121     Nt                2-3          2-3      1-2      2-3          2-3
J97P124     Nt                3-4          3-4      2-3      7-10         3-4
J389PTI     2 hours           5-6          3-4      8-9      8-9          3-4
J389PT4     1 hour            15           10       12       16           15
J389PM2     1 hour            4-5          3-4      3-4      4-5          4-5
J389PM3     1 hour            3-4          2-3      2-3      3-4          3-4
J389PM5     1 hour            4-5          3-4      3-4      4.5          4-5
J389PM7     3 hours           1-2          2-3      1-2      1-2          1-2
J389PM8     2-3 hours         2-3          3-4      2-3      2-3          3-4
J389TI1     3-5 hours         1-2          2-3      1-2      2-3          2-3
J389TI4     2 hours           0            3-4      1-2      2-3          0

"Nt" means "not tested"
TABLE 8

Characterization of repressed BLEC Jurkat clones

Relative repression of beta-lactamase in the indicated clone by the following activator

Clone #        PHA          PHA              PMA (8 nM)       PMA (8 nM)
               (10 µg/ml)   + CsA (100 nM)   + Thaps (1 µM)   + Thaps (1 µM)
                                                              + CsA (100 nM)

J83/97pptr1    90           90               75               75
J83/97pptr2    10           -60              10               -80
J83/97pptr3    10           -50              10               -100
J83/97pptr4    60           60               40               70
J83/97pptr5    50           60               50               50
J83/97pptr6    70           70               70               70

To confirm that changes in reporter gene activity reflected changes in mRNA
expression in these clones, Northern analysis was performed on induced, constitutive, and
repressed clones using a radiolabeled DNA probe directed towards the beta-lactamase
gene. All clones tested for beta-lactamase enzyme inducibility showed beta-lactamase
mRNA inducibility. All clones that showed constitutive expression of beta-lactamase
showed constitutive expression of beta-lactamase mRNA. All clones that showed
repressed beta-lactamase expression showed repressed beta-lactamase mRNA.
The message size of the control beta-lactamase mRNA was about 800 base pairs. The
RNA from some of the other beta-lactamase clones was shifted higher in the gel,
indicating that a fusion RNA had been made between the endogenous transcript and
beta-lactamase. Two known genes, CDK-6 (isolated from clone J83-PTI1) and Erg-3
(isolated from clone J389-PTI4), and two unknown genes, isolated from clones J83PI15
and J83PI2, respectively, were identified. For clone J389-PTI4, a Northern blot was
performed with an Erg-3 probe, made using appropriate PCR primers determined from a
published sequence, which hybridizes with both the fusion RNA and the wild type RNA
(for the sequence of Erg-3 see Stamminger et al., Int. Immunol. 5:63-70 (1993); for PCR
methodologies, see U.S. Patent Nos. 4,800,159, 4,683,195, and 4,683,202). The
inducibility in wild type Jurkat cells mimicked the beta-lactamase activity in this clone.

Example 9 Screening of a library of known pharmacologically active modulators
using a T-cell activated BLEC clone

T-cell clone J32-6D4 was used to identify potential inhibitors of the T-cell
receptor pathway. This clone was selected for further study because it is difficult to
identify chemicals that inhibit a specific T-cell receptor pathway. Thus, this clone was
used to identify chemicals that inhibit this T-cell receptor pathway, which is also
stimulated by the PKC activator PMA.

A first screen was performed using a generic set of 480 chemicals with known
properties. The chemicals in this set were known to have pharmacological activity.
Approximately one percent (7/480) of these chemicals showed greater than 50%
inhibition of the PHA activation of beta-lactamase expression in clone J32-6D4 when
tested in duplicate at 10 µM of chemical. Cells were activated with [illegible] µg/ml of
PHA for 18 hours in the presence of test chemicals to test for inhibitory activity. The
seven chemicals that specifically inhibited clone J32-6D4 are shown in Table 9. Two of
these chemicals specifically inhibited clone J32-6D4 and not the control C2 cell line.
This assay for the specificity of inhibition included screening these 480 chemicals for
inhibitory activity using clone C2, in which the M1 muscarinic receptor was linked to an
NFAT beta-lactamase reporter gene readout (see Example 7). In these experiments, the
inhibition measured was the inhibition of carbachol induced expression of
beta-lactamase. These results, the specific inhibition of J32-6D4 cells but not C2 cells,
show that the chemicals are not toxic, do not inhibit general transcription, and do not
inhibit the reporter gene product.
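
The hit-calling logic described above (greater than 50% inhibition in the target clone, with no comparable inhibition in the control clone) can be sketched as follows. This is a minimal illustration only: the function names, the normalization against stimulated and unstimulated wells, and the example values are assumptions and are not taken from the specification.

```python
def percent_inhibition(treated, stimulated, unstimulated):
    """Percent inhibition of the stimulated reporter signal by a test chemical.

    Assumed normalization: 0% equals the stimulated (e.g. PHA only) signal and
    100% equals the unstimulated signal.
    """
    window = stimulated - unstimulated
    if window <= 0:
        raise ValueError("stimulated signal must exceed unstimulated signal")
    return 100.0 * (stimulated - treated) / window


def specific_hits(target_inhibition, control_inhibition, threshold=50.0):
    """Chemicals above threshold in the target clone but not in the control clone.

    Both arguments map a (hypothetical) chemical name to percent inhibition.
    """
    return sorted(
        name
        for name, pct in target_inhibition.items()
        if pct > threshold and control_inhibition.get(name, 0.0) <= threshold
    )


# Hypothetical example values, for illustration only.
target = {"chem_1": 80.0, "chem_2": 75.0, "chem_3": 20.0}
control = {"chem_1": 70.0, "chem_2": 10.0, "chem_3": 5.0}
print(specific_hits(target, control))
```

Comparing each chemical against a control clone that uses the same reporter is what separates pathway-specific inhibitors from chemicals that are toxic or that inhibit the reporter itself.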


TABLE 9

Active chemicals identified as exhibiting inhibitory activity of PHA activation of clone
J32-6D4

Chemical           % Inhibition of PHA    Inhibition of   Therapeutic Category
(10 µM)            activation of          Clone C2        of the Chemical
                   Clone J32-6D4

Digoxin            86                     +               Cardiotonic
Digitoxin          77                     +               Cardiotonic
Gentian Violet     73                                     Topical anti-infective
Oxyphenbutazone    75                     -               Anti-inflammatory
Mechlorethamine    51                                     Anti-neoplastic
Dipyrithione       70                                     Anti-bacterial
Ouabain            50                     +               Cardiotonic
Thioguanine        50                     +               Anti-neoplastic


Example 10 Screening a library of structurally characterized chemicals having
unknown pharmacological properties for modulating activity of the T-cell receptor
pathway using a T-cell activated BLEC clone

Having demonstrated in Example 9 that clone J32-6D4 performs robustly in a
chemical screen, this clone was used to screen an additional 7,500 chemicals from a 
proprietary chemical library at a concentration of l^^^M per chemical. This collection of 
chemicals, unlike the collection of chemicals used in Example 9, contains chemicals 

without known pharmacological activity. Seventy-seven chemicals showed at least 50%
inhibition of PHA activation of beta-lactamase expression following the general
procedures set forth in Example 7. These 77 chemicals were re-tested for this activity
using the same procedure and 31 chemicals were confirmed to have activity. The IC50
values of the inhibition of PHA activation of beta-lactamase expression were determined
for these 31 chemicals using concentrations of chemical between about 20 µM and 2 nM.
IC50 values reflect the concentration of a chemical needed to inhibit the PHA activation
of the clone by 50% and were determined using known methods. These 31 chemicals
were also tested for their cross inhibition of carbachol induced activation of
beta-lactamase expression of clone C2 as described in Example 8.
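
The specification says only that IC50 values were determined "using known methods". One simple possibility, shown here purely as an illustrative sketch, is log-linear interpolation between the two tested concentrations that bracket 50% inhibition. The function name and the example dose-response values are hypothetical.

```python
import math


def ic50_by_interpolation(concentrations_nm, percent_inhibitions):
    """Estimate IC50 (in nM) by log-linear interpolation.

    Finds the first pair of adjacent concentrations whose inhibition values
    bracket 50% and interpolates on a log10 concentration scale. Returns None
    when 50% inhibition is not bracketed by the tested range.
    """
    points = sorted(zip(concentrations_nm, percent_inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            fraction = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + fraction * (
                math.log10(c_hi) - math.log10(c_lo)
            )
            return 10 ** log_ic50
    return None


# Hypothetical ten-fold dilution series spanning 2 nM to 20 µM.
doses_nm = [2, 20, 200, 2000, 20000]
inhibition = [5.0, 20.0, 50.0, 80.0, 95.0]
print(ic50_by_interpolation(doses_nm, inhibition))
```

A four-parameter logistic fit is a common alternative when more data points are available; interpolation is used here only because it needs no fitting library.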

Two chemicals, designated chemical A and chemical B, exhibited IC50 values
of about 200 nM and specifically inhibited the PHA activation of beta-lactamase
expression of clone J32-6D4 but not the carbachol activation of clone C2 at the
concentration tested. All of the other chemicals among the 31 either inhibited both clone
J32-6D4 and clone C2, or had IC50 values above 1 µM.


[Chemical structures of Chemical A and Chemical B]



Chemicals A and B were further tested for their anti-proliferative effect on Jurkat 
cells and mouse L-cells (mouse fibroblast cell line). Chemical B showed no anti- 
proliferative effect on both the Jurkats and L-cells at concentrations up to 

Chemical A exhibited an anti-proliferative effect on the Jurkats and L-cells at 100 nM.
Proliferation assays were performed by seeding about 20,000 cells unactivated by PHA
into a 24 well plate. These cells were contacted with chemicals and were then incubated
at 37°C for five days. The cells were contacted with 10 µg/ml of MTT (Sigma Chemical
Co., MO) for three hours. The cells were then collected, resuspended in isopropanol, and
the absorbance was read in a plate reader at a wavelength of 570 nm with a background
subtraction of a reading at a wavelength of 690 nm (see, Carmichael et al., Cancer Res.
47:936 (1987)).
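
The MTT readout described above (absorbance at 570 nm with background subtraction of the 690 nm reading) can be sketched as below. Expressing the result as a percentage of an untreated control is a common convention assumed here, not something stated in the text; the function names and absorbance values are hypothetical.

```python
def corrected_absorbance(a570, a690):
    """Background-corrected MTT signal: A570 minus the A690 reference reading."""
    return a570 - a690


def relative_viability(treated, control):
    """Treated signal as a percentage of the untreated control signal.

    Each argument is an (A570, A690) pair from the plate reader.
    """
    control_signal = corrected_absorbance(*control)
    if control_signal <= 0:
        raise ValueError("control signal must be positive after correction")
    return 100.0 * corrected_absorbance(*treated) / control_signal


# Hypothetical absorbance readings, for illustration only.
print(relative_viability(treated=(0.45, 0.05), control=(0.85, 0.05)))
```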

Example 11 Effects of identified chemicals on primary human T-cell proliferation. 

An assay was developed to test the chemicals identified in Example 9 for their
ability to inhibit the activation and proliferation of normal peripheral white blood cells to
confirm their presumptive activity (see generally, Harlow and Lane, Antibodies: A
Laboratory Manual, Cold Spring Harbor Press (1988)). Peripheral blood from normal
humans was drawn into heparinized Vacutainer® tubes and incubated with various
concentrations of the superantigen staphylococcal enterotoxin B (SEB, at 0.001 to 10
ng/ml) for 1 hour at 37°C. Brefeldin A was then added and the cells were incubated for
an additional 5 hours. EDTA was added to detach the cells, and a 100 µl aliquot was
removed; the red blood cells in the aliquot were lysed with ammonium chloride, and the
remaining cells were counted and their viability determined by viability staining using
known methods. The red blood cells remaining in the original sample were lysed with
ammonium chloride and the remaining cells (leukocytes) were permeabilized with FACS
permeabilizing solution using established methods. These leukocytes were harvested by
centrifugation, washed and stained with a combination of detectably labeled antibodies
against CD69, IFN-γ and CD3. Control cells consisted of cells incubated in the absence
of SEB, and staining control cells consisted of cells stained with detectably labeled
CD69/MsIgG1 and CD3 antibodies. Similar cultures will be incubated for 71 hours,
pulsed with tritiated thymidine for 1 hour, harvested, and the incorporated radioactivity
counted by scintillation to determine a stimulation index using established methods.
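
A stimulation index is conventionally the ratio of thymidine incorporation in stimulated cultures to that in unstimulated cultures. The sketch below assumes that convention and uses replicate counts per minute (cpm); the function name and the numbers are hypothetical.

```python
from statistics import mean


def stimulation_index(stimulated_cpm, unstimulated_cpm):
    """Mean cpm of stimulated replicates divided by mean cpm of unstimulated ones."""
    baseline = mean(unstimulated_cpm)
    if baseline <= 0:
        raise ValueError("unstimulated cpm must be positive")
    return mean(stimulated_cpm) / baseline


# Hypothetical triplicate scintillation counts, for illustration only.
print(stimulation_index([12000, 13000, 11000], [1000, 1100, 900]))
```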

Using preferred concentrations of SEB, various concentrations of cyclosporin A
(CsA) were added to determine optimal conditions of CsA for blocking of SEB
stimulation of peripheral blood T-cells for use as a control for non-proliferative T-cells.
Controls consisted of cells incubated with culture media in place of CsA. Control
cultures incubated for 1 hour were blocked with Brefeldin A for an additional 5 hours,
harvested, and stained for intracellular IFN-γ, or cultured for an additional 71 hours,
pulsed with tritiated thymidine for one hour, harvested, and counted by liquid
scintillation.

Using preferred concentrations of SEB and CsA, blood from normal donors was
stimulated in the presence and absence of CsA. This established expected normal ranges
for the degree of activation (% IFN-γ+ activated CD3+ cells for 6 hours), proliferation
(³H-TdR uptake at 72 hours) and CsA blocking at both time points.

Using preferred conditions, human blood was incubated with Chemical A or
Chemical B at 2, 20, and 200 nM. CsA was used as a positive control for T-cell
suppression. One hour cultures were blocked with Brefeldin A for an additional 5 hours,
harvested and counted by liquid scintillation. Cell counts and percent viability were
reported for each culture condition.

The results of these studies should demonstrate that at least one of the chemicals
identified by the methods of the present invention has the predicted pharmacological
activity in human cells.

Example 12: Identification of genes expressed during developmental programs.

Another use of this method is for the identification of genes expressed during
various cellular processes, such as development and apoptosis. Genes involved
in specific developmental programs, such as the differentiation of pre-adipocytes to
mature adipocytes, can be identified using this method.

In order to practice this method, a clone library from a pre-adipocyte cell line such
as 3T3-L1 is made using the methods generally described in Examples 10 to 12 above.
Of course, pre-adipocyte cells are used rather than Jurkat cells. This cell line can be
reversibly differentiated to mature adipocytes by exposing the cells to dexamethasone and
indomethacin (see, Hunt et al., Proc. Natl. Acad. Sci. U.S.A. 83:3786-3789 (1986)).
These mature adipocytes can be reversibly differentiated to pre-adipocytes with Tumor
Necrosis Factor alpha (TNFa) (see, Torti et al., J. Cell. Biol. 108:1105-1113 (1989)). Thus,
a cell library capable of signaling the expression of genes involved in cellular
differentiation can be made.

The 3T3-L1 gene trap library is FACS sorted to remove blue cells, which
constitutively express beta-lactamase. The remaining green cells are then differentiated
into mature adipocytes using dexamethasone and indomethacin. Blue (beta-lactamase
expressing) cells are isolated using FACS. These clones represent cells in which the
trapping construct integrates into a gene that is expressed in differentiated adipocytes, but
not in undifferentiated adipocytes. This process can be repeated multiple times to ensure
enrichment for cells that express adipocyte specific genes.
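
The two-step sort described above amounts to a simple selection rule: keep clones that are green (no beta-lactamase) before differentiation and blue after it. The sketch below expresses that rule on per-clone blue/green emission ratios. The clone identifiers, the ratio values, and the threshold of 1.0 separating blue from green are all hypothetical and chosen only for illustration.

```python
def differentiation_induced_clones(ratio_before, ratio_after, blue_threshold=1.0):
    """Clones that are green before differentiation and blue afterwards.

    ratio_before / ratio_after map a clone id to its blue/green emission ratio.
    A ratio above blue_threshold is treated as blue (beta-lactamase expressing).
    """
    green_before = {
        clone for clone, ratio in ratio_before.items() if ratio <= blue_threshold
    }
    return sorted(
        clone
        for clone in green_before
        if ratio_after.get(clone, 0.0) > blue_threshold
    )


# Hypothetical ratios: clone_b is constitutively blue, clone_c never responds.
before = {"clone_a": 0.2, "clone_b": 3.0, "clone_c": 0.3}
after = {"clone_a": 2.5, "clone_b": 3.1, "clone_c": 0.3}
print(differentiation_induced_clones(before, after))
```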

Alternatively, cell clones can be isolated which are differentiated for a specific
time interval. For instance, blue and green cells differentiated for 2 days with
dexamethasone and indomethacin are sorted. These populations of cells represent cells
in which the trapping construct integrates into a gene that is expressed early in the
differentiation process. This allows the identification of genes that are expressed during
the developmental program but are not expressed in pre-adipocytes or mature adipocytes.
This method can be used to isolate genes expressed during a variety of developmental
programs, including but not limited to neuronal, cardiac, muscle, and cancer cells.

These cell lines can be used to identify genes involved in the differentiation
process, and can also be used to screen chemicals that modulate the differentiation
process using the methods described in Examples 8 to 10 above. Drugs that can be
identified include those that enhance the growth of cells, such as neuronal cells, or
depress the growth or reverse differentiation of cells, such as cancer cells.


Example 13: Assays for modulators of G-protein coupled receptors 

The general procedures of Examples 8 to 10 can be used in an analogous manner
to identify cell lines suitable for screens for G-protein coupled receptors (GPCRs).
GPCRs are known to signal via one of several intracellular pathways. These pathways
can be activated pharmacologically in cell libraries to yield potential screening cell lines.
For example, Gq coupled GPCRs are known to raise intracellular free calcium via
activation of phospholipase Cβ (PLCβ). By isolating cell lines responsive to an increase
in calcium from the genomic library (e.g. induced by ionomycin or thapsigargin),
screening cell lines are generated.

For example, a calcium-sensitive clone was transfected with a Gq-type GPCR by
electroporation. Cells from clone J389PTI4 were transfected by electroporation with a
plasmid, either pcDNA3 (Invitrogen) or pcDNA3-M1 (pcDNA3 that can operably express
the M1 receptor), to make cell lines J389PTI4/pcDNA3 and J389PTI4/pcDNA3-M1,
respectively. Cell line J389PTI4/pcDNA3-M1 expressed the M1 receptor, whereas the cell
line J389PTI4/pcDNA3 did not. Thus, the J389PTI4/pcDNA3 cell is a control cell. Two days
after transfection, cells were stimulated with 20 µM carbachol in a 96-well microtiter plate
for 6 hours at 37°C. These cells were then contacted with CCF-2 dye for another 90 minutes.
The 460/530 ratio changes were measured in a CytoFluor (Series 4000 Model) (PerSeptive
Biosystems) fluorescence plate reader and correspond to reporter gene expression. These
results are summarized in Table 10. The ability of the transiently-transfected clone to
detect a ligand for the GPCR demonstrates the potential of generating screening cell lines
using clones made following the procedures of the present invention. The stimulation by
carbachol detected in the transient transfection assay represents a response in about 20% of
the cells. To develop a stable screening cell line for the M1 receptor, this population can
be sorted for individual clones responsive to carbachol and those clones can be expanded
and screened to identify the most responsive clones.
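
The 460/530 emission ratio readout mentioned above can be turned into a relative expression value by dividing the ratio of stimulated wells by that of unstimulated wells. The sketch below assumes a simple background subtraction from cell-free wells; the function names, the background handling, and the intensity values are hypothetical.

```python
def emission_ratio(blue_460, green_530, blue_background=0.0, green_background=0.0):
    """Background-corrected ratio of 460 nm (blue) to 530 nm (green) emission."""
    green = green_530 - green_background
    if green <= 0:
        raise ValueError("green signal must be positive after background correction")
    return (blue_460 - blue_background) / green


def fold_induction(stimulated_ratio, unstimulated_ratio):
    """Reporter expression in stimulated cells relative to unstimulated cells."""
    if unstimulated_ratio <= 0:
        raise ValueError("unstimulated ratio must be positive")
    return stimulated_ratio / unstimulated_ratio


# Hypothetical plate-reader intensities, for illustration only.
unstimulated = emission_ratio(300, 300, 100, 100)
stimulated = emission_ratio(900, 300, 100, 100)
print(fold_induction(stimulated, unstimulated))
```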

Similar methods can be used to generate cell lines for Gs or Gi-coupled receptors.
In these cases, clones responsive to increases or decreases in cAMP can be isolated. A
variety of cell lines can be used for these procedures, such as CHO, HEK293,
Neuroblastoma, P19, F11, and NT-2 cells.


TABLE 10

Cell lines that report modulation of the M1 receptor pathway

Relative expression of beta-lactamase in cells exposed to the indicated stimuli

Cell Line              Unstimulated    30 µM Carbachol    10 nM PHA

J389PTI4/pcDNA3        1               1                  12
J389PTI4/pcDNA3-M1     1               4                  13
SEQUENCE ID. LISTING


SEQ.ID NO. 1: range 1 to 795

10 20 30 40 50
ATG AGT CAC CCA GAA ACG CTG GTG AAA GTA AAA GAT GCT GAA GAT CAG TTG
Met Ser His Pro Glu Thr Leu Val Lys Val Lys Asp Ala Glu Asp Gln Leu

60 70 80 90 100
GGT GCA CGA GTG GGT TAC ATC GAA CTG GAT CTC AAC AGC GGT AAG ATC CTT
Gly Ala Arg Val Gly Tyr Ile Glu Leu Asp Leu Asn Ser Gly Lys Ile Leu

110 120 130 140 150
GAG AGT TTT CGC CCC GAA GAA CGT TTT CCA ATG ATG AGC ACT TTT AAA GTT
Glu Ser Phe Arg Pro Glu Glu Arg Phe Pro Met Met Ser Thr Phe Lys Val

160 170 180 190 200
CTG CTA TGT GGC GCG GTA TTA TCC CGT GTT GAC GCC GGG CAA GAG CAA CTC
Leu Leu Cys Gly Ala Val Leu Ser Arg Val Asp Ala Gly Gln Glu Gln Leu

210 220 230 240 250
GGT CGC CGC ATA CAC TAT TCT CAG AAT GAC TTG GTT GAG TAC TCA CCA GTC
Gly Arg Arg Ile His Tyr Ser Gln Asn Asp Leu Val Glu Tyr Ser Pro Val

260 270 280 290 300
ACA GAA AAG CAT CTT ACG GAT GGC ATG ACA GTA AGA GAA TTA TGC AGT GCT
Thr Glu Lys His Leu Thr Asp Gly Met Thr Val Arg Glu Leu Cys Ser Ala

310 320 330 340 350
GCC ATA ACC ATG AGT GAT AAC ACT GCG GCC AAC TTA CTT CTG ACA ACG ATC
Ala Ile Thr Met Ser Asp Asn Thr Ala Ala Asn Leu Leu Leu Thr Thr Ile

360 370 380 390 400
GGA GGA CCG AAG GAG CTA ACC GCT TTT TTG CAC AAC ATG GGG GAT CAT GTA
Gly Gly Pro Lys Glu Leu Thr Ala Phe Leu His Asn Met Gly Asp His Val

410 420 430 440 450
ACT CGC CTT GAT CGT TGG GAA CCG GAG CTG AAT GAA GCC ATA CCA AAC GAC
Thr Arg Leu Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala Ile Pro Asn Asp

460 470 480 490 500 510
GAG CGT GAC ACC ACG ATG CCT GCA GCA ATG GCA ACA ACG TTG CGC AAA CTA
Glu Arg Asp Thr Thr Met Pro Ala Ala Met Ala Thr Thr Leu Arg Lys Leu

520 530 540 550 560
TTA ACT GGC GAA CTA CTT ACT CTA GCT TCC CGG CAA CAA TTA ATA GAC TGG
Leu Thr Gly Glu Leu Leu Thr Leu Ala Ser Arg Gln Gln Leu Ile Asp Trp

570 580 590 600 610
ATG GAG GCG GAT AAA GTT GCA GGA CCA CTT CTG CGC TCG GCC CTT CCG GCT
Met Glu Ala Asp Lys Val Ala Gly Pro Leu Leu Arg Ser Ala Leu Pro Ala

620 630 640 650 660
GGC TGG TTT ATT GCT GAT AAA TCT GGA GCC GGT GAG CGT GGG TCT CGC GGT
Gly Trp Phe Ile Ala Asp Lys Ser Gly Ala Gly Glu Arg Gly Ser Arg Gly

670 680 690 700 710
ATC ATT GCA GCA CTG GGG CCA GAT GGT AAG CCC TCC CGT ATC GTA GTT ATC
Ile Ile Ala Ala Leu Gly Pro Asp Gly Lys Pro Ser Arg Ile Val Val Ile

720 730 740 750 760
TAC ACG ACG GGG AGT CAG GCA ACT ATG GAT GAA CGA AAT AGA CAG ATC GCT
Tyr Thr Thr Gly Ser Gln Ala Thr Met Asp Glu Arg Asn Arg Gln Ile Ala

770 780 790
GAG ATA GGT GCC TCA CTG ATT AAG CAT TGG
Glu Ile Gly Ala Ser Leu Ile Lys His Trp


SEQ.ID NO. 2: range 1 to 858

10 20 30 40 50
ATG AGA ATT CAA CAT TTC CGT GTC GCC CTT ATT CCC TTT TTT GCG GCA TTT
Met Arg Ile Gln His Phe Arg Val Ala Leu Ile Pro Phe Phe Ala Ala Phe

60 70 80 90 100
TGC CTT CCT GTT TTT GGT CAC CCA GAA ACG CTG GTG AAA GTA AAA GAT GCT
Cys Leu Pro Val Phe Gly His Pro Glu Thr Leu Val Lys Val Lys Asp Ala

110 120 130 140 150
GAA GAT CAG TTG GGT GCA CGA GTG GGT TAC ATC GAA CTG GAT CTC AAC AGC
Glu Asp Gln Leu Gly Ala Arg Val Gly Tyr Ile Glu Leu Asp Leu Asn Ser

160 170 180 190 200
GGT AAG ATC CTT GAG AGT TTT CGC CCC GAA GAA CGT TTT CCA ATG ATG AGC
Gly Lys Ile Leu Glu Ser Phe Arg Pro Glu Glu Arg Phe Pro Met Met Ser

210 220 230 240 250
ACT TTT AAA GTT CTG CTA TGT GGC GCG GTA TTA TCC CGT GTT GAC GCC GGG
Thr Phe Lys Val Leu Leu Cys Gly Ala Val Leu Ser Arg Val Asp Ala Gly

260 270 280 290 300
CAA GAG CAA CTC GGT CGC CGC ATA CAC TAT TCT CAG AAT GAC TTG GTT GAG
Gln Glu Gln Leu Gly Arg Arg Ile His Tyr Ser Gln Asn Asp Leu Val Glu

310 320 330 340 350
TAC TCA CCA GTC ACA GAA AAG CAT CTT ACG GAT GGC ATG ACA GTA AGA GAA
Tyr Ser Pro Val Thr Glu Lys His Leu Thr Asp Gly Met Thr Val Arg Glu

360 370 380 390 400
TTA TGC AGT GCT GCC ATA ACC ATG AGT GAT AAC ACT GCG GCC AAC TTA CTT
Leu Cys Ser Ala Ala Ile Thr Met Ser Asp Asn Thr Ala Ala Asn Leu Leu

410 420 430 440 450
CTG ACA ACG ATC GGA GGA CCG AAG GAG CTA ACC GCT TTT TTG CAC AAC ATG
Leu Thr Thr Ile Gly Gly Pro Lys Glu Leu Thr Ala Phe Leu His Asn Met

460 470 480 490 500 510
GGG GAT CAT GTA ACT CGC CTT GAT CGT TGG GAA CCG GAG CTG AAT GAA GCC
Gly Asp His Val Thr Arg Leu Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala

520 530 540 550 560
ATA CCA AAC GAC GAG CGT GAC ACC ACG ATG CCT GCA GCA ATG GCA ACA ACG
Ile Pro Asn Asp Glu Arg Asp Thr Thr Met Pro Ala Ala Met Ala Thr Thr

570 580 590 600 610
TTG CGC AAA CTA TTA ACT GGC GAA CTA CTT ACT CTA GCT TCC CGG CAA CAA
Leu Arg Lys Leu Leu Thr Gly Glu Leu Leu Thr Leu Ala Ser Arg Gln Gln

620 630 640 650 660
TTA ATA GAC TGG ATG GAG GCG GAT AAA GTT GCA GGA CCA CTT CTG CGC TCG
Leu Ile Asp Trp Met Glu Ala Asp Lys Val Ala Gly Pro Leu Leu Arg Ser

670 680 690 700 710
GCC CTT CCG GCT GGC TGG TTT ATT GCT GAT AAA TCT GGA GCC GGT GAG CGT
Ala Leu Pro Ala Gly Trp Phe Ile Ala Asp Lys Ser Gly Ala Gly Glu Arg

720 730 740 750 760
GGG TCT CGC GGT ATC ATT GCA GCA CTG GGG CCA GAT GGT AAG CCC TCC CGT
Gly Ser Arg Gly Ile Ile Ala Ala Leu Gly Pro Asp Gly Lys Pro Ser Arg

770 780 790 800 810
ATC GTA GTT ATC TAC ACG ACG GGG AGT CAG GCA ACT ATG GAT GAA CGA AAT
Ile Val Val Ile Tyr Thr Thr Gly Ser Gln Ala Thr Met Asp Glu Arg Asn

820 830 840 850
AGA CAG ATC GCT GAG ATA GGT GCC TCA CTG ATT AAG CAT TGG
Arg Gln Ile Ala Glu Ile Gly Ala Ser Leu Ile Lys His Trp


SEQ.ID NO. 3: range 1 to 795

AAGCTTTTTGCAGAAGCTCAGAATAAACGCAACTTTCCGGGTACCACC

10 20 30 40 50
ATG GGG CAC CCA GAA ACG CTG GTG AAA GTA AAA GAT GCT GAA GAT CAG TTG GGT GCA
Met Gly His Pro Glu Thr Leu Val Lys Val Lys Asp Ala Glu Asp Gln Leu Gly Ala

60 70 80 90 100
CGA GTG GGT TAC ATC GAA CTG GAT CTC AAC AGC GGT AAG ATC CTT GAG AGT
Arg Val Gly Tyr Ile Glu Leu Asp Leu Asn Ser Gly Lys Ile Leu Glu Ser

110 120 130 140 150
TTT CGC CCC GAA GAA CGT TTT CCA ATG ATG AGC ACT TTT AAA GTT CTG CTA
Phe Arg Pro Glu Glu Arg Phe Pro Met Met Ser Thr Phe Lys Val Leu Leu

160 170 180 190 200 210
TGT GGC GCG GTA TTA TCC CGT GAT GAC GCC GGG CAA GAG CAA CTC GGT CGC
Cys Gly Ala Val Leu Ser Arg Ile Asp Ala Gly Gln Glu Gln Leu Gly Arg

220 230 240 250 260
CGC ATA CAC TAT TCT CAG AAT GAC TTG GTT GAG TAC TCA CCA GTC ACA GAA
Arg Ile His Tyr Ser Gln Asn Asp Leu Val Glu Tyr Ser Pro Val Thr Glu

270 280 290 300 310
AAG CAT CTT ACG GAT GGC ATG ACA GTA AGA GAA TTA TGC AGT GCT GCC ATA
Lys His Leu Thr Asp Gly Met Thr Val Arg Glu Leu Cys Ser Ala Ala Ile

320 330 340 350 360
ACC ATG AGT GAT AAC ACT GCG GCC AAC TTA CTT CTG ACA ACG ATC GGA GGA
Thr Met Ser Asp Asn Thr Ala Ala Asn Leu Leu Leu Thr Thr Ile Gly Gly

370 380 390 400 410
CCG AAG GAG CTA ACC GCT TTT TTG CAC AAC ATG GGG GAT CAT GTA ACT CGC
Pro Lys Glu Leu Thr Ala Phe Leu His Asn Met Gly Asp His Val Thr Arg

420 430 440 450 460
CTT GAT CAT TGG GAA CCG GAG CTG AAT GAA GCC ATA CCA AAC GAC GAG CGT
Leu Asp His Trp Glu Pro Glu Leu Asn Glu Ala Ile Pro Asn Asp Glu Arg

470 480 490 500 510
GAC ACC ACG ATG CCT GTA GCA ATG GCA ACA ACG TTG CGC AAA CTA TTA ACT
Asp Thr Thr Met Pro Val Ala Met Ala Thr Thr Leu Arg Lys Leu Leu Thr

520 530 540 550 560
GGC GAA CTA CTT ACT CTA GCT TCC CGG CAA CAA TTA ATA GAC TGG ATG GAG
Gly Glu Leu Leu Thr Leu Ala Ser Arg Gln Gln Leu Ile Asp Trp Met Glu

570 580 590 600 610
GCG GAT AAA GTT GCA GGA CCA CTT CTG CGC TCG GCC CTT CCG GCT GGC TGG
Ala Asp Lys Val Ala Gly Pro Leu Leu Arg Ser Ala Leu Pro Ala Gly Trp

620 630 640 650 660
TTT ATT GCT GAT AAA TCT GGA GCC GGT GAG CGT GGG TCT CGC GGT ATC ATT
Phe Ile Ala Asp Lys Ser Gly Ala Gly Glu Arg Gly Ser Arg Gly Ile Ile

670 680 690 700 710 720
GCA GCA CTG GGG CCA GAT GGT AAG CCC TCC CGT ATC GTA GTT ATC TAC ACG
Ala Ala Leu Gly Pro Asp Gly Lys Pro Ser Arg Ile Val Val Ile Tyr Thr

730 740 750 760 770
ACG GGG AGT CAG GCA ACT ATG GAT GAA CGA AAT AGA CAG ATC GCT GAG ATA
Thr Gly Ser Gln Ala Thr Met Asp Glu Arg Asn Arg Gln Ile Ala Glu Ile

780 790
GGT GCC TCA CTG ATT AAG CAT TGG
Gly Ala Ser Leu Ile Lys His Trp


SEQ.ID NO. 4: range 1 to 792

10 20 30 40 50
ATG GAC CCA GAA ACG CTG GTG AAA GTA AAA GAT GCT GAA GAT CAG TTG GGT
Met Asp Pro Glu Thr Leu Val Lys Val Lys Asp Ala Glu Asp Gln Leu Gly

60 70 80 90 100
GCA CGA GTG GGT TAC ATC GAA CTG GAT CTC AAC AGC GGT AAG ATC CTT GAG
Ala Arg Val Gly Tyr Ile Glu Leu Asp Leu Asn Ser Gly Lys Ile Leu Glu

110 120 130 140 150
AGT TTT CGC CCC GAA GAA CGT TTT CCA ATG ATG AGC ACT TTT AAA GTT CTG
Ser Phe Arg Pro Glu Glu Arg Phe Pro Met Met Ser Thr Phe Lys Val Leu

160 170 180 190 200
CTA TGT GGC GCG GTA TTA TCC CGT ATT GAC GCC GGG CAA GAG CAA CTC GGT
Leu Cys Gly Ala Val Leu Ser Arg Ile Asp Ala Gly Gln Glu Gln Leu Gly

210 220 230 240 250
CGC CGC ATA CAC TAT TCT CAG AAT GAC TTG GTT GAG TAC TCA CCA GTC ACA
Arg Arg Ile His Tyr Ser Gln Asn Asp Leu Val Glu Tyr Ser Pro Val Thr

260 270 280 290 300
GAA AAG CAT CTT ACG GAT GGC ATG ACA GTA AGA GAA TTA TGC AGT GCT GCC
Glu Lys His Leu Thr Asp Gly Met Thr Val Arg Glu Leu Cys Ser Ala Ala

310 320 330 340 350
ATA ACC ATG AGT GAT AAC ACT GCG GCC AAC TTA CTT CTG ACA ACG ATC GGA
Ile Thr Met Ser Asp Asn Thr Ala Ala Asn Leu Leu Leu Thr Thr Ile Gly

360 370 380 390 400
GGA CCG AAG GAG CTA ACC GCT TTT TTG CAC AAC ATG GGG GAT CAT GTA ACT
Gly Pro Lys Glu Leu Thr Ala Phe Leu His Asn Met Gly Asp His Val Thr

410 420 430 440 450
CGC CTT GAT CAT TGG GAA CCG GAG CTG AAT GAA GCC ATA CCA AAC GAC GAG
Arg Leu Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala Ile Pro Asn Asp Glu

460 470 480 490 500 510
CGT GAC ACC ACG ATG CCT GTA GCA ATG GCA ACA ACG TTG CGC AAA CTA TTA
Arg Asp Thr Thr Met Pro Val Ala Met Ala Thr Thr Leu Arg Lys Leu Leu

520 530 540 550 560
ACT GGC GAA CTA CTT ACT CTA GCT TCC CGG CAA CAA TTA ATA GAC TGG ATG
Thr Gly Glu Leu Leu Thr Leu Ala Ser Arg Gln Gln Leu Ile Asp Trp Met

570 580 590 600 610
GAG GCG GAT AAA GTT GCA GGA CCA CTT CTG CGC TCG GCC CTT CCG GCT GGC
Glu Ala Asp Lys Val Ala Gly Pro Leu Leu Arg Ser Ala Leu Pro Ala Gly

620 630 640 650 660
TGG TTT ATT GCT GAT AAA TCT GGA GCC GGT GAG CGT GGG TCT CGC GGT ATC
Trp Phe Ile Ala Asp Lys Ser Gly Ala Gly Glu Arg Gly Ser Arg Gly Ile

670 680 690 700 710
ATT GCA GCA CTG GGG CCA GAT GGT AAG CCC TCC CGT ATC GTA GTT ATC TAC
Ile Ala Ala Leu Gly Pro Asp Gly Lys Pro Ser Arg Ile Val Val Ile Tyr

720 730 740 750 760
ACG ACG GGG AGT CAG GCA ACT ATG GAT GAA CGA AAT AGA CAG ATC GCT GAG
Thr Thr Gly Ser Gln Ala Thr Met Asp Glu Arg Asn Arg Gln Ile Ala Glu

770 780 790
ATA GGT GCC TCA CTG ATT AAG CAT TGG
Ile Gly Ala Ser Leu Ile Lys His Trp


SEQ.ID NO. 5: range 1 to 766 


10 20 30 40 50 


ATG AAA GAT GAT TTT GCA AAA CTT GAG GAA CAA TTT GAT GCA AAA CTC GGG 

Met Lys Asp Asp Phe Ala Lys Leu Glu Glu Gin Phe Asp Ala Lys Leu Gly 
60 70 80 90 100 

•it * **** ★* * * 

40 ATC TTT GCA TTG GAT ACA GGT ACA AAC CGG ACG GTA GCG TAT CGG CCG GAT 
He Phe Ala Leu Asp Thr Gly Thr Asn Arg Thr Val Ala Tyr Arg Pro Asp 
110 120 130 140 150 

* ★ ** *«* * ** 

GAG CGT TTT GCT TTT GCT TCG ACG ATT AAG GCT TTA ACT GTA GGC GTG CTT 
45 Glu Arg Phe Ala Phe Ala Ser Thr He Lys Ala Leu Thr Val Gly Val Leu 
160 170 180 190 200 

* ********* 

TTG CAA CAG AAA TCA ATA GAA GAT CTG AAC CAG AGA ATA ACA TAT ACA CGT 
Leu Gin Gin Lys Ser He Glu Asp Leu Asn Gin Arg He Thr Tyr Thr Arg 
50 210 220 230 240 250 

** * ** * ** * ** 

GAT GAT CTT GTA AAC TAC AAC CCG ATT ACG GAA AAG CAC GTT GAT ACG GGA 
Asp Asp Leu Val Asn Tyr Asn Pro He Thr Glu Lys His Val Asp Thr Gly 
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260 270 280 290 300 




* 


♦ 

★ 


♦ 


* 

* 


*■ 


* 

* 


♦ 


ATG 

ACG 

CTC 

AAA 

GAG 

CTT 

GCG 

GAT 

GCT 

TCG 

CTT 

CGA 

TAT 

AGT 

GAC 

AAT 

GCG 


Met 

Thr 

Leu 

Lys 

Glu 

Leu 

Ala 

Asp 

Ala 

Ser 

Leu 

Arg 

Tyr 

Ser 

Asp 

Asn 

Ala 

5 

310 



320 



330 


340 



350 






* 


* 


* 

★ 


★ 



* 


* 


* 


GCA 

CAG 

AAT 

CTC 

ATT 

CTT 

AAA 

CAA 

ATT 

GGC 

GGA 

CCT 

GAA 

AGT 

TTG 

AAA 

AAG 


Ala 

Gin 

Ash 

Leu 

He 

Leu 

Lys 

Gin 

He 

Glv 

Gly 

Pro 

Glu 

Ser 

Leu 

Lys 

Lys 


360 


370 



380 



390 


400 



1 n 

J u 

* 




* 

* 




* 



•* 


* 




GAA 

CTG 

AGG 

AAG 

ATT 

GGT 

GAT 

GAG 

GTT 

ACA 

AAT 

ccc 

GAA 

CGA 

TTC 

GAA 

CCA 



Leu 

Arg 

Lys 

He 

Gl V 

Asp 

Glu 

Val 

Thr 

Asn 

Pro 

Glu 

Arg 

Phe 

Glu 

Pro 


410 



420 


430 



440 



450 






* 





* 

-* 


* 


* 

* 


★ 


J 5 

GAG 

TTA 

AAT 

GAA 

GTG 

AAT 

CCG 

GGT 

GAA 

ACT 

CAG 

GAT 

ACC 

AGT 

ACA 

GCA 

AGA 


m n 

lieu 

Asn 

Glu 

V d J. 

Asn 

Pro 

Gly 

Glu 

Thr 

Gin 

Asp 

Thr 

Ser 

Thr 

Ala 

Arq 


4 60 



470 



480 


490 



500 



510 



* 


* 


* 

★ 


♦ 


* 



★ 


* 

♦ 



CTT 

GTC 

ACA 

AGC 

CTT 

CGA 

GCC 

TTT 

GCT 

CTT 

GAA 

GAT 

AAA 

CTT 

CCA 

AGT 

zu 

AI a 

Leu 

Val 

Thr 

Ser 

Leu 

Arg 


Phe 

Ala 

Leu 

Glu 

Asp 

Lys 

Leu 

Pro 

Ser 




520 



530 



540 


550 



560 



* 





-* 


★ 

* 




* 

*■ 


* 


GAA 

AAA 

CGC 

GAG 

CTT 

TTA 

ATC 

GAT 

TOG 

ATG 

AAA 

CGA 

AAT 

ACC 

ACT 

GGA 

GAC 


Vji. u. 

Lys 

Arg 

Glu 

Leu 

Leu 

He 

Asp 

Trp 

Met 

Lys 

Arg 

Asn 

Thr 

Thr 

Gly 

Asp 

Z3 



570 580 590 600 610
GCC TTA ATC CGT GCC GGA GTG CCG GAC GGT TGG GAA GTG GCT GAT AAA ACT
Ala Leu Ile Arg Ala Gly Val Pro Asp Gly Trp Glu Val Ala Asp Lys Thr

620 630 640 650 660
GGT GCG GCA TCA TAT GGA ACC CGG AAT GAC ATT GCC ATC ATT TGG CCG CCA
Gly Ala Ala Ser Tyr Gly Thr Arg Asn Asp Ile Ala Ile Ile Trp Pro Pro

670 680 690 700 710
AAA GGA GAT CCT GTC GTT CTT GCA GTA TTA TCC AGC AGG GAT AAA AAG GAC
Lys Gly Asp Pro Val Val Leu Ala Val Leu Ser Ser Arg Asp Lys Lys Asp



720 730 740 750 760
GCC AAG TAT GAT GAT AAA CTT ATT GCA GAG GCA ACA AAG GTG GTA ATG AAA
Ala Lys Tyr Asp Asp Lys Leu Ile Ala Glu Ala Thr Lys Val Val Met Lys



770 780
GCC TTA AAC ATG AAC GGC AAA
Ala Leu Asn Met Asn Gly Lys
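
By way of illustration only, the correspondence between the codon rows and the amino acid rows of the listing above can be checked mechanically. The following Python sketch assumes the standard genetic code; the names `translate` and `mismatches` are illustrative and do not appear in the original disclosure.

```python
from itertools import product

# Standard genetic code, with amino acids listed in T, C, A, G order
# for the first, second and third codon positions.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One-letter to three-letter residue names ("Ter" marks a stop codon).
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(codons):
    """Return the three-letter residue name for each codon."""
    return [THREE_LETTER[CODON_TABLE[codon.upper()]] for codon in codons]


def mismatches(codon_row, residue_row):
    """List (index, codon, printed residue, translated residue) disagreements.

    Rows are whitespace-separated strings; comparison stops at the
    shorter row.
    """
    codons = codon_row.split()
    residues = residue_row.split()
    translated = translate(codons)
    return [
        (i, codon, printed, expected)
        for i, (codon, printed, expected)
        in enumerate(zip(codons, residues, translated))
        if printed != expected
    ]
```

Calling `mismatches` on a codon row and the residue row printed beneath it returns an empty list when the two rows agree, and otherwise points to the positions that warrant a second look.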




WO 98/13353 PCT/US97/17395

We claim: 

1. A method for identifying proteins or chemicals that directly or indirectly modulate
a genomic polynucleotide comprising:

providing a β-lactamase integrated into a non-yeast, eukaryotic genome contained
in at least one living cell,

contacting said cell with a predetermined concentration of a modulator, and

detecting BL activity from said cell.

2. The method of claim 1, wherein said detecting further comprises measuring
cleavage of a membrane permeant BL substrate, wherein said membrane permeant BL
substrate is transformed in said cell.

3. The method of claim 2, wherein said membrane permeant BL substrate comprises 
a donor and acceptor. 

4. The method of claim 3, wherein said detecting further comprises measuring FRET 
between said donor and said acceptor. 

5. The method of claim 2, wherein said living cell is a mammalian cell.

6. The method of claim 5, wherein said BL expression construct randomly integrates 
into said genome. 

7. The method of claim 6, wherein said living cell is contacted with said modulator
prior to inserting of said BL expression construct in said non-yeast, eukaryotic genome
and further comprising the step of determining the coding nucleic acid sequence of a
polynucleotide operably linked to said BL expression construct, wherein said construct
comprises a splice donor, a splice acceptor and an IRES element.

8. The method of claim 5, wherein said BL expression construct encodes cytosolic 
BL and said cell comprises a receptor that is known to bind said modulator. 

9. The method of claim 8, wherein said receptor is a nuclear receptor heterologously
expressed by said cell. 

10. The method of claim 8, wherein said receptor has a transmembrane domain and is 
homologously expressed by said cell. 

11. The method of claim 10, wherein said modulator is a non-peptide.

12. The method of claim 8, wherein said cell is contacted with a predetermined
concentration of a second modulator and detecting BL activity before and after contacting
said cell with said second modulator.

13. The method of claim 5, wherein said cell comprises an orphan protein 
heterologously expressed by said cell. 

14. The method of claim 5, wherein said BL activity is increased in the presence of
said modulator compared with the β-lactamase activity in the absence of said modulator.

15. The method of claim 5, wherein said modulator is known to bind to a receptor
expressed by said cell and said BL activity in said cell is increased in the presence of said
modulator compared to the BL activity detected from a corresponding cell in the presence
of said modulator, wherein said corresponding cell does not express said receptor.

16. A method of identifying active genomic polynucleotides, comprising: 
contacting living cells with a membrane permeant BL substrate, and 
sorting living cells by fluorescence, 

wherein said cells are eukaryotic cells and comprise a genome having a stably 
integrated BL expression construct and said fluorescence indicates BL activity. 

17. The method of claim 16, wherein said sorting further comprises measuring 
cleavage of a membrane permeant BL substrate by fluorescence spectroscopy in a FACS, 
wherein said membrane permeant BL substrate is transformed in said cell. 

18. The method of claim 17, wherein said membrane permeant BL substrate
has a donor and acceptor and said measuring further comprises measuring FRET between 
a donor and an acceptor. 

19. The method of claim 17, wherein said sorting further comprises separating said
cells without BL activity from said cells with BL activity.
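
As a non-limiting illustration of the sorting recited in claims 16 to 19, the following Python sketch partitions cells using two emission channels. It assumes, for illustration only, that cleavage of the substrate by BL disrupts FRET so that donor emission rises relative to acceptor emission. The `CellEvent` type, the `sort_by_bl_activity` function and the threshold value are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class CellEvent:
    """One cell measured in two emission channels (arbitrary units)."""
    cell_id: int
    donor_emission: float
    acceptor_emission: float


def sort_by_bl_activity(
    events: Iterable[CellEvent], ratio_threshold: float = 1.0
) -> Tuple[List[CellEvent], List[CellEvent]]:
    """Split events into (with BL activity, without BL activity).

    Assumption: a cell is scored as having BL activity when its
    donor/acceptor emission ratio exceeds ratio_threshold, which is a
    hypothetical gate value chosen by the operator.
    """
    with_activity: List[CellEvent] = []
    without_activity: List[CellEvent] = []
    for event in events:
        # Compare without dividing so a zero acceptor signal cannot raise.
        if event.donor_emission > ratio_threshold * event.acceptor_emission:
            with_activity.append(event)
        else:
            without_activity.append(event)
    return with_activity, without_activity
```

In practice a gate of this kind would be set from control populations; the single fixed threshold here is only meant to show the partition into two sorted fractions.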

20. The method of claim 19, wherein said cells are contacted with only a cell culture 
medium in the absence of a test chemical. 

21. The method of claim 20, wherein said cells without BL activity are contacted with
a test chemical and further sorted by fluorescence for BL activity. 

22. The method of claim 21, wherein said test chemical is an agonist.

23. The method of claim 21, wherein said test chemical is an antagonist.

24. The method of claim 21, wherein said cells without BL activity are contacted with 
a test chemical and further sorted by fluorescence for BL activity. 

25. The method of claim 22, wherein said cells with BL activity are contacted with an
antagonist and further sorted by fluorescence for BL activity.

26. The method of claim 17, wherein said cells express an identified receptor that 
binds a modulator known to bind to said identified receptor. 

27. The method of claim 26, wherein said living cells comprise a heterologous
G-protein.

28. The method of claim 17, wherein said living cells comprise a heterologous protein
having a membrane domain.

29. A composition of matter comprising a non-yeast, eukaryotic cell having a genome
with a stably integrated BL expression construct comprising a polynucleotide encoding a
protein having BL activity, an IRES element, a splice donor site and a splice acceptor
site.

30. The composition of matter of claim 29, further comprising a heterologous protein 
expressed in said cell. 

31. The composition of matter of claim 30, wherein said cell is a mammalian cell.

32. The composition of matter of claim 31, wherein said polynucleotide contains
nucleic acid sequences that are preferred by said mammalian cell for expression.

33. The composition of matter of claim 32, wherein said cell further comprises a 
membrane permeant BL substrate, wherein said membrane permeant BL substrate is 
transformed inside said cell by intracellular esterases. 

34. The composition of matter of claim 33, wherein said polynucleotide encodes a
cytosolic BL.

35. A method of screening compounds with an active genomic polynucleotide, 
comprising: 

1) optionally contacting a multiclonal population of cells with a first test chemical
prior to separating said cells by a FACS,

2) separating by a FACS said multiclonal population of cells into BL expressing
cells and non-BL expressing cells, wherein said BL expressing cells have a detectable
difference in cellular fluorescence properties compared to non-BL expressing cells, and

Ai) contacting said non-BL expressing cells with a second test chemical, and

Aii) sorting by a FACS said non-BL expressing cells into a) second test chemical
activated cells and b) second test chemical non-activated cells,

wherein said second test chemical activated cells have BL activity detectable by a
FACS and said second test chemical non-activated cells have no BL activity detectable
by FACS, or

Bi) contacting said BL expressing cells with a third test chemical, and

Bii) sorting by a FACS said BL expressing cells into a) third test chemical
activated cells and b) third test chemical non-activated cells,

wherein said third test chemical activated cells have BL activity detectable by a
FACS and said third test chemical non-activated cells have no BL activity detectable by
FACS,

wherein said multiclonal population of cells comprises eukaryotic cells having a
BL expression construct integrated into a genome of said eukaryotic cells and a
membrane permeant BL substrate transformed inside said cells to a membrane
impermeant BL substrate.
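
Purely as an illustrative sketch, the branching of steps 2, Ai/Aii and Bi/Bii of claim 35 can be written as two successive partitions. In the Python below each gate is a hypothetical predicate standing in for a FACS decision; the function names, the dictionary keys and the representation of a cell as a `dict` are assumptions made for the example and are not part of the disclosure. The claim recites branch A or branch B, whereas the sketch simply evaluates both.

```python
from typing import Callable, Dict, List, Tuple

Cell = dict                    # hypothetical record describing one cell
Gate = Callable[[Cell], bool]  # stands in for one FACS sorting decision


def facs_split(cells: List[Cell], gate: Gate) -> Tuple[List[Cell], List[Cell]]:
    """Partition cells into (gate-positive, gate-negative)."""
    positive = [cell for cell in cells if gate(cell)]
    negative = [cell for cell in cells if not gate(cell)]
    return positive, negative


def screen_population(
    cells: List[Cell],
    baseline_gate: Gate,
    second_chemical_gate: Gate,
    third_chemical_gate: Gate,
) -> Dict[str, List[Cell]]:
    """Apply the step 2 split, then the A and B branch splits."""
    # Step 2: BL expressing versus non-BL expressing cells.
    expressing, non_expressing = facs_split(cells, baseline_gate)
    # Steps Ai and Aii: re-sort the non-expressing cells after the
    # second test chemical.
    a_on, a_off = facs_split(non_expressing, second_chemical_gate)
    # Steps Bi and Bii: re-sort the expressing cells after the third
    # test chemical.
    b_on, b_off = facs_split(expressing, third_chemical_gate)
    return {
        "second_chemical_activated": a_on,
        "second_chemical_non_activated": a_off,
        "third_chemical_activated": b_on,
        "third_chemical_non_activated": b_off,
    }
```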

36. The method of claim 35, wherein said BL activity is measured by FRET. 

37. The method of claim 35, wherein said steps of Ai and Aii or Bi and Bii are
repeated.

38. The method of claim 35, wherein said second test chemical activated cells are
washed, then contacted with a modulator in the presence of said second test chemical and
tested for BL activity.

39. The method of claim 38, wherein said modulator is present in a concentration of
10 μM or less.

40. The method of claim 35, wherein said eukaryotic cells express a heterologous
protein.

41. A method for identifying an expressed protein that directly or indirectly
modulates a genomic polynucleotide, comprising: providing at least one living
non-yeast, eukaryotic cell comprising a β-lactamase polynucleotide that can be under
transcriptional control of said at least one living non-yeast, eukaryotic cell's genome and
stably integrated into a genomic polynucleotide site, contacting said cell with a
predetermined concentration of a known modulator, and detecting β-lactamase activity
from said at least one living non-yeast, eukaryotic cell; wherein said at least one living
non-yeast, eukaryotic cell expresses a heterologous protein and said known modulator
increases or decreases the expression of said β-lactamase polynucleotide in the presence of
said heterologous protein.

42. The method of claim 41, wherein said detecting further comprises measuring
cleavage of a membrane permeant β-lactamase substrate, wherein said membrane
permeant β-lactamase substrate is transformed in said at least one living non-yeast,
eukaryotic cell.

43. The method of claim 42, wherein said membrane permeant β-lactamase substrate
has a donor and acceptor in said at least one living non-yeast, eukaryotic cell.

44. The method of claim 3, wherein said method further comprises sorting a
population of cells with a FACS.

45. The method of claim 41, wherein said cell is a mammalian cell. 

46. The method of claim 45, wherein said β-lactamase polynucleotide includes a
β-lactamase expression construct for random integration into said genome.

47. The method of claim 46, further comprising the step of determining a portion of
the coding nucleic acid sequence of a polynucleotide operably linked to said β-lactamase
expression construct.

48. The method of claim 45, wherein said BL expression construct comprises
cytosolic β-lactamase, said construct comprises a splice donor, a splice acceptor and an
IRES element and said cell comprises a receptor that is known to bind said known
modulator.

49. The method of claim 45, wherein said heterologous protein is selected from the
group consisting of hormone receptors, intracellular receptors, receptors of the cytokine
superfamily, G-protein coupled receptors, heterologous G-proteins, neurotransmitter
receptors, and tyrosine kinase receptors.

50. The method of claim 45, wherein said heterologous protein has a transmembrane
domain.

51. The method of claim 50, further comprising over expressing said heterologous 
protein. 

52. The method of claim 45, wherein said at least one living non-yeast, eukaryotic
cell is contacted with a predetermined concentration of a second modulator and detecting
β-lactamase activity after contacting said cell with said known modulator.

53. The method of claim 45, wherein said cell comprises an orphan protein 
heterologously expressed by said at least one living non-yeast, eukaryotic cell. 

54. The method of claim 45, wherein said β-lactamase activity is increased in the
presence of said modulator compared to the absence of said modulator.

55. The method of claim 45, wherein said known modulator is known to bind to a
receptor and said β-lactamase activity in said at least one living non-yeast, eukaryotic
cell is increased in the presence of said modulator compared to the β-lactamase activity
detected from a corresponding cell in the presence of said known modulator, wherein said
corresponding cell does not express said heterologous protein.

56. A method for identifying modulators, comprising: 

a) contacting at least one living mammalian cell with a test chemical at a
predetermined concentration and a known modulator at a predetermined
concentration, wherein said at least one living mammalian cell comprises a
β-lactamase polynucleotide that can be under transcriptional control of said at least
one living mammalian cell's genome and stably integrated into a genomic
polynucleotide site, and

b) detecting expression of said β-lactamase polynucleotide by said at least one
living mammalian cell, wherein said known modulator increases or decreases
expression of said β-lactamase polynucleotide located at said genomic
polynucleotide site.

57. The method of claim 56, wherein said test chemical changes expression of said
β-lactamase polynucleotide by said known modulator.

58. The method of claim 56, wherein said β-lactamase polynucleotide further
comprises a splice acceptor site.

59. The method of claim 58, wherein said β-lactamase polynucleotide further
comprises an IRES.

60. The method of claim 57, wherein said test chemical or known modulator is
provided at a concentration less than about 1 μM.

61. The method of claim 56, further comprising separating a population of living
mammalian cells into 1) a population of living mammalian cells that expresses
β-lactamase and 2) a population of living mammalian cells that does not express
β-lactamase.

62. The method of claim 61, wherein said separating further comprises measuring
cleavage of a membrane permeant β-lactamase substrate in said population of living
mammalian cells by fluorescence spectroscopy in a FACS, wherein the fluorescence of
said membrane permeant β-lactamase substrate is transformed by β-lactamase in at least
one living mammalian cell.

63. The method of claim 56, wherein said known modulator modulates a receptor
selected from the group consisting of intracellular receptors and G-protein coupled 
receptors. 

64. The method of claim 63, wherein said known modulator is an agonist.

65. The method of claim 63, wherein said known modulator is an antagonist. 

66. The method of claim 64, wherein said known modulator is contacted with said at
least one living mammalian cell prior to contacting said test chemical with said at least
one living mammalian cell.

67. The method of claim 56, wherein said test chemical is a modulator for a protein
selected from the group consisting of hormone receptors, intracellular receptors, receptors
of the cytokine superfamily, G-protein coupled receptors, heterologous G-proteins,
neurotransmitter receptors, and tyrosine kinase receptors.

68. The method of claim 56, wherein said at least one living mammalian cell further
comprises a heterologously expressed protein selected from the group consisting of
hormone receptors, intracellular receptors, signaling molecules, receptors of the cytokine
superfamily, G-protein coupled receptors, heterologous G-proteins, neurotransmitters,
and tyrosine kinase receptors.

69. The method of claim 68, wherein said heterologously expressed protein is a
G-protein coupled receptor or a heterologous G-protein.

70. The method of claim 56, further comprising the step of activating said at least one
living mammalian cell with a G-protein coupled receptor modulator. 

71. The method of claim 70, wherein said at least one living mammalian cell further
comprises an orphan receptor.

72. The method of claim 56, wherein said at least one living mammalian cell is of a cell
type from a panel of different cell types and steps (a) and (b) are performed on each cell
type.

73. The method of claim 56, wherein said genomic polynucleotide site is part of a 
gene not known to be modulated by said known modulator. 

74. The method of claim 73, wherein said known modulator is an agonist.

75. The method of claim 74, wherein said test chemical is an antagonist.

76. The method of claim 73, wherein said known modulator is an antagonist. 

77. The method of claim 76, wherein said test chemical is an agonist. 

78. A method for identifying a modulator, comprising: 

a) contacting a population of non-yeast, eukaryotic cells with a test chemical and
a known modulator, wherein said population of non-yeast, eukaryotic cells
comprises a genome with a stably integrated β-lactamase expression construct,
comprising:

1) a polynucleotide encoding a protein having β-lactamase activity, and

2) a splice acceptor site; and

b) detecting the activity of said β-lactamase polynucleotide expressed by said
population of non-yeast, eukaryotic cells, wherein said known modulator
increases or decreases the expression of said polynucleotide encoding a protein
having β-lactamase activity, and said known modulator modulates a biological
process or target.

79. The method of claim 78, wherein said β-lactamase expression construct further
comprises a splice donor site.

80. The method of claim 79, wherein said β-lactamase expression construct further
comprises an IRES element.

81. The method of claim 78, wherein said population of non-yeast, eukaryotic cells
further comprises an expressed heterologous G-protein coupled receptor. 

82. The method of claim 81, wherein said population of non-yeast, eukaryotic cells 
further comprises an orphan G-protein coupled receptor. 

83. A method for identifying a ligand of a target, comprising: contacting a eukaryotic
cell with a test chemical at a predetermined concentration, wherein said eukaryotic cell
comprises 1) a genomic polynucleotide with a β-lactamase expression construct under
expression control by a first polynucleotide in said genomic polynucleotide and 2) a
target that does not normally modulate transcription of a gene product under expression
control of said first polynucleotide with proviso that said target can directly or indirectly
alter expression of said β-lactamase expression construct under expression control by said
first polynucleotide, and

determining expression of said β-lactamase expression construct.

84. The method of claim 83, wherein said eukaryotic cell is a mammalian cell.

85. The method of claim 84, wherein said target is a heterologously expressed protein.

86. The method of claim 85, wherein said heterologously expressed protein is a
membrane protein.

87. The method of claim 84, wherein said heterologously expressed protein is a GPCR. 

88. The method of claim 84, wherein said heterologously expressed protein is an ion
channel.

89. The method of claim 84, further comprising contacting a eukaryotic cell with a test
chemical at a predetermined concentration, wherein said eukaryotic cell comprises 1) a
genomic polynucleotide with a β-lactamase expression construct under expression control
by a first polynucleotide in said genomic polynucleotide and 2) a target that does not
normally modulate transcription of a gene product under expression control of said first
polynucleotide.

90. The method of claim 84, wherein said gene product is normally expressed in a first 
tissue and said target is normally expressed in a second tissue, wherein said first tissue is 
of a different embryonic origin than said second tissue. 

91. The method of claim 84, wherein said gene product is normally expressed in a first
cell in vivo and said target is normally expressed in a second cell in vivo, wherein said 
first cell is a different cell type than said second cell. 

92. The method of claim 84, wherein expression of said gene product is normally
repressed and said target does not increase expression of said gene product in vivo in
naturally occurring cells.

93. The method of claim 84, wherein said gene product is normally expressed in a first 
cell in vivo and said target is normally expressed in a second cell in vivo, wherein said 
first cell is a different cell type than said second cell. 

94. The method of claim 84, wherein expression of said gene product in said eukaryotic
cell is not detectable in the absence of said target and said eukaryotic cell does not
express detectable levels of protein of said target in the absence of heterologous
expression of said target.

95. The method of claim 84, wherein native protein of said gene product and native
protein of said target are not expressed in detectable levels in a single, naturally occurring
cell.

96. The method of claim 84, wherein native protein of said target in a naturally occurring 
cell does not modulate expression of native protein of said gene product in said naturally 
occurring cell. 

97. A method for identifying a cellular function of an orphan protein, comprising:

contacting a eukaryotic cell with a test chemical at a predetermined concentration,
wherein said eukaryotic cell comprises 1) a genomic polynucleotide with a β-lactamase
expression construct under expression control by a first polynucleotide in said genomic
polynucleotide and 2) an orphan protein,

determining expression of said β-lactamase expression construct, and

identifying the function of said genomic polynucleotide with said β-lactamase
expression construct or its corresponding gene where said β-lactamase expression
construct has integrated.

98. The method of claim 97, wherein said eukaryotic cell is a mammalian cell.

99. The method of claim 98, wherein said orphan is a heterologously expressed protein.

100. The method of claim 99, wherein said heterologously expressed orphan protein 
has putative transmembrane domain. 

101. The method of claim 98, wherein said heterologously expressed orphan protein is 
homologous to a GPCR of known function and is overexpressed. 

102. A method for identifying a modulator of an orphan protein, comprising:

contacting a eukaryotic cell with a test chemical at a predetermined concentration,
wherein said eukaryotic cell comprises 1) a genomic polynucleotide with a β-lactamase
expression construct under expression control by a first polynucleotide in said genomic
polynucleotide and 2) an orphan protein that modulates expression of said β-lactamase
expression construct, and

determining expression of said β-lactamase expression construct.

103. The method of claim 102, wherein said eukaryotic cell is a mammalian cell.

104. The method of claim 103, wherein said orphan protein is a heterologously 
expressed protein. 

105. The method of claim 102, wherein said heterologously expressed orphan protein
has putative transmembrane domain. 

106. The method of claim 102, wherein said heterologously expressed orphan protein 
is over expressed and is homologous to a GPCR of known function. 

107. A method for identifying intracellular pathways, comprising:

expressing a protein of interest in a plurality of eukaryotic cells, wherein each
eukaryotic cell comprises a genomic polynucleotide with a β-lactamase expression
construct under expression control by a polynucleotide in said genomic polynucleotide,
and said plurality of cells has a plurality of integration sites where said β-lactamase
expression construct has integrated into said genome of each said eukaryotic cell,

optionally contacting said plurality of eukaryotic cells with a ligand of said
protein of interest,

determining expression from said β-lactamase expression construct, and

identifying said polynucleotide if said expressing of said protein of interest alters
expression from said β-lactamase expression construct or if said contacting said ligand of
said protein of interest alters expression from said β-lactamase expression construct,

wherein alteration of said expression from said β-lactamase expression construct
indicates participation of said protein of interest in an intracellular signaling pathway.

108. The method of claim 107, wherein said eukaryotic cell is a mammalian cell.

109. The method of claim 108, wherein said protein of interest is a heterologously
expressed protein and has a known ligand.

110. The method of claim 108, wherein said protein of interest is a heterologously
expressed protein and has no known ligand.

111. The method of claim 109, further comprising isolating a eukaryotic cell from said
plurality of eukaryotic cells and characterizing said polynucleotide.

112. The method of claim 109, wherein each said eukaryotic cell in said plurality of
eukaryotic cells is an isolated, clonal population of cells.

113. The method of claim 112, wherein said plurality of cells comprises at least
10,000 isolated clonal populations of cells.

114. A method for determining a cellular response profile for a target, comprising:

expressing a protein of interest in a plurality of eukaryotic cells, wherein each
eukaryotic cell comprises a genomic polynucleotide with a β-lactamase expression
construct under expression control by a polynucleotide in said genomic polynucleotide,
and said plurality of cells has a plurality of integration sites where said β-lactamase
expression construct has integrated into said genome of each said eukaryotic cell,

optionally contacting said plurality of eukaryotic cells with a ligand of said
protein of interest,

determining expression from said β-lactamase expression constructs, and

identifying plurality of said polynucleotides exhibiting an increase, decrease or no
change in expression from said β-lactamase expression that results from either said
expressing of said protein of interest or said contacting of said ligand,

wherein an increase, decrease or no change in expression of each said
polynucleotide from said plurality of polynucleotides indicates a profile of cellular
response relating to said protein of interest.
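
As an illustration only, a response profile of the kind recited in claim 114 can be tabulated by labeling each clone as showing an increase, a decrease or no change. The Python sketch below assumes that a numeric reporter reading is available for each clone before and after the perturbation; the function name, the clone identifiers and the two-fold cutoff are hypothetical and are not taken from the disclosure.

```python
from typing import Dict


def response_profile(
    baseline: Dict[str, float],
    perturbed: Dict[str, float],
    fold_threshold: float = 2.0,
) -> Dict[str, str]:
    """Label each clone "increase", "decrease" or "no change".

    baseline and perturbed map a clone identifier to a reporter
    reading. fold_threshold is an assumed cutoff: a change counts only
    if the reading rises or falls by at least this factor.
    """
    profile = {}
    for clone, before in baseline.items():
        after = perturbed[clone]
        if after >= before * fold_threshold:
            profile[clone] = "increase"
        elif after * fold_threshold <= before:
            profile[clone] = "decrease"
        else:
            profile[clone] = "no change"
    return profile
```

The resulting mapping from clone to label is one simple way to represent the profile; the choice of cutoff would in practice depend on assay noise.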

115. A method for determining a cellular response profile for a chemical, comprising: 

expressing a protein of interest in a plurality of eukaryotic cells, wherein each
eukaryotic cell comprises a genomic polynucleotide with a β-lactamase expression
construct under expression control by a polynucleotide in said genomic polynucleotide,
and said plurality of cells has a plurality of integration sites where said β-lactamase
expression construct has integrated into said genome of each said eukaryotic cell,

optionally contacting said plurality of eukaryotic cells with a ligand of said
protein of interest,

contacting said plurality of eukaryotic cells with a test chemical at a
predetermined concentration, and

determining expression from said β-lactamase expression constructs, and

identifying plurality of said polynucleotides exhibiting an increase, decrease or no
change in expression from said β-lactamase expression that results from either said
expressing of said protein of interest or said contacting of said ligand,

wherein an increase, decrease or no change in expression of each said
polynucleotide from said plurality of polynucleotides indicates a profile of cellular
response relating to said test chemical.

116. A method for identifying a modulator of a viral component, comprising:

contacting a eukaryotic cell with a test chemical at a predetermined concentration,
wherein said eukaryotic cell comprises 1) a genomic polynucleotide with a β-lactamase
expression construct under expression control by a first polynucleotide in said genomic
polynucleotide and 2) a viral component is not previously known to modulate
transcription of a gene product under expression control of said first polynucleotide and
said viral component is not an oncogene or proto-oncogene or protein product thereof,
and

determining expression of said β-lactamase expression construct.

117. The method of claim 116, wherein said viral component is selected from the list
consisting of a virus, a capsule, a viral polynucleotide, or a viral protein.

118. The method of claim 117, further comprising contacting a second eukaryotic cell
with said test chemical at a predetermined concentration, wherein said eukaryotic cell
comprises 1) a second genomic polynucleotide with a β-lactamase expression construct
under expression control by a second polynucleotide in said second genomic
polynucleotide and 2) said viral component, and

determining expression of said β-lactamase expression construct, wherein said
viral component is selected from the list consisting of a virus, a capsule, a viral
polynucleotide, or a viral protein.

119. The method of claim 118, wherein said second eukaryotic cell is from a
population of eukaryotic cells, each said eukaryotic cell comprising 1) a genomic
polynucleotide with a β-lactamase expression construct and 2) said viral component.

120. A method for identifying a cellular function of a viral component, comprising:

contacting a eukaryotic cell with a viral component at a predetermined
concentration or expressing a viral component in said eukaryotic cell, wherein said
eukaryotic cell comprises 1) a genomic polynucleotide with a β-lactamase expression
construct under expression control by a first polynucleotide in said genomic
polynucleotide,

optionally contacting said eukaryotic cell with a second viral component of a virus
that is different from said viral component,

determining expression of said β-lactamase expression construct, and

identifying the function of said genomic polynucleotide with said β-lactamase
expression construct or gene where said β-lactamase expression construct has integrated.

121. A method for identifying a chemical that modulates a physiological response or
cellular pathway, comprising:

contacting a eukaryotic cell with a test chemical at a predetermined concentration,
wherein said eukaryotic cell comprises 1) a genomic polynucleotide with a β-lactamase
expression construct under expression control by a first polynucleotide in said genomic
polynucleotide, wherein said cell is characterized as comprising a physiological response
of interest or a cellular pathway of interest, and

contacting said eukaryotic cell with a signal molecule, and

determining expression of said β-lactamase expression construct.

122. The method of claim 121, wherein said signal molecule is a naturally occurring molecule
that binds to the outside of said eukaryotic cell and said eukaryotic cell is a mammalian
cell.

123. The method of claim 122, wherein said physiological response occurs in vivo in a cell
selected from the group consisting of a nerve cell, cardiac cell, epithelial cell, muscle cell,
endocrine cell, paracrine cell, blood cell, and connective tissue cell.

124. The method of claim 121, wherein said signal molecule increases expression.

125. The method of claim 124, wherein said polynucleotide has a gene product that
does not alter said cellular pathway or physiological response.

126. A chemical identified by any of the above methods for identifying useful 
chemicals. 

127. A method for identifying and developing a drug, comprising: 

1) contacting a population of non-yeast, eukaryotic cells with a test chemical and 
a known modulator, wherein said population of non-yeast, eukaryotic cells 
comprises a genome with a stably integrated β-lactamase expression construct, 
comprising: 

a) a polynucleotide encoding a protein having β-lactamase activity, and 

b) a splice acceptor site; and 

2) detecting expression of said β-lactamase polynucleotide expressed by said 
population of non-yeast, eukaryotic cells, wherein said known modulator 
increases or decreases the expression of said polynucleotide encoding a 
protein having β-lactamase activity, and said known modulator modulates a 
biological process or target, 

3) determining whether said test chemical alters expression of said β-lactamase 
polynucleotide, 

4) optionally testing for toxic effects of said test chemical in a cell-based assay, 

5) optionally generating a second test chemical based on the structure-property 
relationships of said test chemical, 

6) optionally determining whether said second test chemical alters expression of 
said β-lactamase polynucleotide, 

7) testing for toxic effects of said test chemical or said second test chemical in a 
mammal, and 

8) testing for therapeutic effects of said test chemical or said second test 
chemical in a mammal. 

128. A drug chemical identified and developed by the following method, comprising: 
1) contacting a population of non-yeast, eukaryotic cells with a test chemical and 
a known modulator, wherein said population of non-yeast, eukaryotic cells 
comprises a genome with a stably integrated β-lactamase expression construct, 
comprising: 

a) a polynucleotide encoding a protein having β-lactamase activity, and 

b) a splice acceptor site; and 

2) detecting expression of said β-lactamase polynucleotide expressed by said 
population of non-yeast, eukaryotic cells, wherein said known modulator 
increases or decreases the expression of said polynucleotide encoding a protein 
having β-lactamase activity, and said known modulator modulates a biological 
process or target, 

3) determining whether said test chemical alters expression of said β-lactamase 
polynucleotide, 

4) optionally testing for toxic effects of said test chemical in a cell-based assay, 

5) optionally generating a second test chemical based on the structure-property 
relationships of said test chemical, 

6) optionally determining whether said second test chemical alters expression of 
said β-lactamase polynucleotide, 

7) testing for toxic effects of said test chemical or said second test chemical in a 
mammal, and 

8) testing for therapeutic effects of said test chemical or said second test 
chemical in a mammal. 


129. The drug of claim 128, wherein said drug can be used to treat a medical condition 
selected from the group consisting of immune response, cardiac dysfunctions and disease, 
vascular dysfunctions and diseases, neural dysfunctions and disease, endocrine 
dysfunctions and disease, gastro-intestinal dysfunctions and disease, obesity, diabetes, 
inflammation dysfunctions and disease, cancer and trauma. 

130. A pharmaceutical composition, comprising a therapeutic agent and a 
pharmaceutically acceptable carrier. 

131. The pharmaceutical composition of claim 130, said therapeutic agent having the 
structure of Chemical A or B and said pharmaceutically acceptable carrier is selected for 
treating undesired T-cell activation or an undesired immune response. 